Fluid Concept Architecture: A Critical Evaluation 



Zelfstandig Studieproject 
Joaquin Vanschoren, 2CW 

June 2004 
Abstract 

Although many AI programs perform some impressive feats, they usually don't have 
a clue about exactly what they are doing, depending on humans to do the perception of 
the problem for them. Complex architectures using 'fluid concepts' try to remedy this 
by an in-depth treatment of concepts and by incorporating perception as a fundamental 
part of problem solving. This paper focuses on understanding how these architectures
work, what their (dis)advantages are, and how they relate to other approaches in artificial
intelligence.

1 Introduction: Concepts slipping under pressure 

Last summer, some friends and I were sitting on the lawn at the university campus. It was
quite warm, some of us had taken off our shoes, and someone had brought a laptop which
was playing some music in the background. Suddenly someone wanted to change the song, 
so he leaned over to the laptop, grabbed a shoe lying right next to it, moved it around 
a while, and unwittingly said to the owner "That's strange, your mouse isn't working!". 
This funny anecdote illustrates one of the most fundamental characteristics of concepts as 
we use them in everyday life. In my friend's mind, the concept 'mouse' slipped unnoticed 
into the concept 'shoe'. However, some pressure was needed (in this case the lack of the 
mouse he was expecting to be there), and the concepts had to be somewhat similar (I 
guess the tip of the shoe had somewhat the same shape and feel to it). Of course, this is 
quite a dramatic example. My friend probably also let this 'laptop computer' slip into his 
more familiar 'desktop computer', since he was expecting a mouse, and laptops lying on 
the lawn usually don't have any. This slippage is much more obvious, since both concepts 
are closely linked, and a much smaller pressure is needed. These minor slippages occur 
every day, over and over again, and are inherent to the way we think. It only gets funny 
when unexpected, large slippages occur. 

Douglas R. Hofstadter and his graduate students have been working for a long time 
now on a cognitive model that incorporates, among other things, these characteristics of 
concepts into a dynamic whole [8]. Their work has led to a refreshingly new, but complex 
architecture, that is capable of performing its own perception, and that uses this percep- 
tion to look for an answer in a focused way. Instead of using a set of worked-out rules 
that are all carried out sequentially, tiny agents try to make a perception out of a group 
of concepts, by linking them with each other (biased by these concepts' connotations), 
while working in parallel. This will create a drive that guides the program towards what 
seems to be, according to the program, a plausible solution. To give a flavor of the way 
this works, imagine a column of ants, moving through a forest. No one tells the column 
where to go, or which direction to try, but a legion of scout ants tries different directions
in parallel, and through the play of pheromones, a general consensus and drive emerges,
leading the colony the right way. Concepts are somewhat like sugar for these ants, but 
the ants themselves decide, depending on the context, which sugar trails they want to 
follow. 

Starting from this general feeling, we will get up close and personal with the basic
architecture in Section 2. Once we've understood the mechanisms underlying the model,
we can discuss its merits and drawbacks in Section 3. Finally, in Section 4, we can widen
our view and contrast our findings with other approaches in AI and cognitive science.

2 The architecture 

Due to the emergent behavior of the program, it might become very unclear which aspects 
are actually implemented, and which are emergent. Therefore, we shall first discuss the 
actual basic components individually, before explaining how they work together. Also, 
to emphasize the intended generality of the architecture, it would be instructive to strip 
these components from their various implementations. One specific project however, the 
intriguing Copycat-project, which can safely be considered the mother of this architec- 
ture, shall be used as a running example, elucidating the model's mechanisms. Once we 
understand Copycat, we can look into Metacat, its successor. Metacat carries a natural 
architectural extension, which can be most easily explained after grasping the basics. We 
shall now turn to a short description of these projects, and the well defined micro-domain 
they share. 

2.1 Running example: the Copycat project 

Copycat [5, ch.5] is a computer program tackling analogy making in an effective, psychologically
realistic way. Given the vastness of such an endeavor, it might easily have grown
out of proportion. Therefore, Copycat was born with a small, but well defined domain,
involving only letter-strings. Here is an example, consider it for a while before reading 
any further: 

Suppose the letter-string aabc were changed to aabd, 

how would you change the letter-string ijkk in "the same way"? 

There is more to this domain than meets the eye. You probably noticed the successorship- 
relation and came up with ijkl right away, but you also might feel quite uncomfortable 
with disregarding the doubled k. Keeping this in mind and grouping these letters, ijll 
could be a better solution. But still, the doubled a is ignored. A solution that quickly 
slips into mind is grouping the doubled letters, and noticing that these groups are at 
opposite positions. So, it would be reasonable to take the successor of the leftmost letter, 
instead of that of the rightmost one. This yields jjkk. But still, something feels wrong, 
since the successor-relationship also has an opposite, which we didn't exploit. Taking the 
predecessor of the leftmost letter, we finally arrive at hjkk... 
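
To make the four competing answers concrete, here is a minimal Python sketch. It is not how Copycat works: the four readings of "aabc => aabd" are simply hard-coded by hand, whereas Copycat has to discover them through perception. All function names are my own and purely illustrative.

```python
import string

ALPHABET = string.ascii_lowercase


def successor(letter):
    """Return the next letter of the alphabet ('z' has none)."""
    index = ALPHABET.index(letter)
    if index == len(ALPHABET) - 1:
        raise ValueError("'z' has no successor")
    return ALPHABET[index + 1]


def predecessor(letter):
    """Return the previous letter of the alphabet ('a' has none)."""
    index = ALPHABET.index(letter)
    if index == 0:
        raise ValueError("'a' has no predecessor")
    return ALPHABET[index - 1]


def rightmost_group_start(s):
    """Index where the run of identical letters at the right end begins."""
    start = len(s) - 1
    while start > 0 and s[start - 1] == s[-1]:
        start -= 1
    return start


def candidate_answers(target):
    """Apply four hand-picked readings of 'aabc => aabd' to the target."""
    start = rightmost_group_start(target)
    group_length = len(target) - start
    return {
        "successor of rightmost letter":
            target[:-1] + successor(target[-1]),
        "successor of rightmost group":
            target[:start] + successor(target[-1]) * group_length,
        "successor of leftmost letter":
            successor(target[0]) + target[1:],
        "predecessor of leftmost letter":
            predecessor(target[0]) + target[1:],
    }


for reading, answer in candidate_answers("ijkk").items():
    print(f"{reading}: {answer}")
```

Each reading corresponds to one of the slippages described above; the interesting part, which this sketch leaves out entirely, is how a program comes to prefer one reading over another.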

I hope you enjoyed the 'pressures' you felt while perceiving this problem. I also hope 
you noticed how these pressures made you slip from one concept into a closely related one, 
such as 'letter' into 'letter group' or 'successor' into 'predecessor'. This should give a good 
idea about the fluidity of concepts. Copycat will try to take advantage of this fluidity in 
perceiving the problem, and the pressures arising will guide the program in looking for a 
solution. Whereas Copycat restricted itself to modeling only the subcognitive processes of 
analogy making, Metacat extends this and introduces extensive self-watching mechanisms 
to increase the effectiveness of the program. I'll discuss these later on. 



[Figure 1 diagram: the source 'desktop computer' (has screen, calculates, plays music, used on desk) is mapped onto the target 'laptop computer' (has screen, calculates, plays music, used in grass) through recognition, elaboration and evaluation. Within this system, the pointing device 'mouse' (close to pc screen, fits in palm, lies on desk) is transferred onto 'shoe' (close to pc screen, fits in palm, lies in grass).]

Figure 1: Analogy-making in the shoe-example.



2.1.1 The structure of analogy 

The driving force for Copycat's focusing on analogy making is the belief that perceiving 
similarities (an essential part of analogy making) lies at the crux of intelligence. Also, it's 
hard not to notice the ubiquity of analogy making in everyday life. Recognizing faces, an- 
imals, sounds, (music) styles and situations, handling problems that are vaguely familiar, 
learning by relating to previous examples, all the way up to being creative in a somewhat 
rational way, all this would be very hard, if not impossible, without the ability to make 
analogies. In any case, you must perceive two non-identical objects or situations as being
the "same" at some abstract level [17]. One might argue that letter-strings are far too
primitive as entities and have little to do with real-world analogy making. Whereas a detailed
discussion of the generality and applicability of the model will follow after having
understood the basic mechanisms, I would first like to explain how letter-strings can still 
bear many aspects involved in general analogy making. I believe that this understanding 
will be important in a correct judgment of the mechanisms that are to be explained shortly. 

Analogy making is mostly envisaged as a mapping between a source- and a target- 
domain. According to Hall [3|5], this encompasses 4 abstract processes: recognition of 
a source given the target, elaboration of a possible mapping between the two, evaluation 
of this mapping, and a transfer of information from source to target. Let's illustrate 
this reflecting on what happened last summer. You see, my friend also unwittingly made 
an analogy. When he wanted to change the song, he focused on the laptop [target]. 
This reminded him [recognition] of his desktop computer [source]. Both have screens, 
computing power and the ability to play audio files [elaboration] . There is however also a 
small difference. For example, desktops mostly sit on a desk, while this laptop was lying 
in the grass. But still, helped with a pinch of absent-mindedness, the mapping seems 
plausible enough [evaluation]. Having the picture of a desktop in mind, he knows he has 
to find some pointing device to change the song. The picture of a mouse enters the scene. 
Back in the real world, he vaguely notices this mouse-shaped thing lying right next to the 
computer screen (or he just acted blindly, and this information entered his mind through 
his hand). Since they both fit well into his palm, and are at somewhat the expected 
position, he concluded [transfer] he was dealing with a mouse. The situation is depicted 
in figure 1. An analogy between 'mouse' and 'touch pad' would, of course, have been 
more likely, probably gone by unnoticed, being one of the thousands of good analogies we 
make every day. 

Figure 2 is a snapshot from an actual run of Metacat. The program was given the same 
problem we considered previously. Notice that the program got the recognition-phase for
free (it doesn't have to perform it). This phase is simply not included in Copycat. Doing
this would mean the program should also automatically 'see' analogy-problems. Humans 
do this all the time, mostly based on input that wasn't what they expected, but closely 
resembled something else. Although recognizing conceptual closeness is a basic quality 
of the program, this probably would require additional architecture, which is currently 
beyond the scope of the project. The other aspects however, are all present. The different 
letters of a string are an abstracted representation of different comparable characteristics 
of objects under analogy. Without these underlying similarities, analogies would be impossible.
As we shall see in a minute, these letters are part of a small, but rich world with
a reasonable number of relations and meta-relations. The same goes for the characteris- 
tics of our human-mind objects, although their world is much larger, and relations can 
be thought up and justified just by looking for them, even if the relation wasn't in mem- 
ory previously. The elaboration 'bridges' are also present, meaning the program found 
a relation between the characteristics of the different objects (strings), like 'similar' or 
'successor'. The evaluation happened in the background, depending on how far-fetched 
or convincing they are (a bridge between two similar groups is pretty convincing). Fi- 
nally, the system found a convincing mapping between the two source strings, and has 
transferred the horizontal mapping accordingly. The dotted lines are some structures the 
program was trying in parallel when it found the solution. 

Still, the use of letters instead of general real-world objects seems to cover up another 
aspect, namely the (low-level) recognition of these objects. Seeing a letter-like shape 
doesn't mean you recognize the letter. Generally speaking, fluid concept architecture 
does not do any low-level perception, only high-level perception (starting from the basic 
recognition of low-level elements, such as letters). We'll discuss this in detail in section 
3. In Copycat, the program is told what a letter stands for (this thing is a letter 'a'), 
and what its neighboring things are. Still, there is a lot of work left in perceiving the 
letter strings (seeing it as a group of successive letters for example). Telling the program 
which letter corresponds with a given token is certainly more simple than describing 
a real-world object. But, just as a letter-string is described only by its letters (and 
letting the program do the rest of the recognition), a real-world object would also only be
described by its most readily observed characteristics. An object is never fully described, 
but the program uses these readily observed characteristics to make suggestions about 
what the object is like, and what its most important characteristics are in the given 
context. This approach is fundamentally different from other analogy-making approaches in AI,
which require their programmers to fully describe the object to the program, and to
decide which characteristics are most important. This must all sound a little fuzzy, but 
I'll come back to this in section 3. First, on to the architecture! 

2.2 Three not so easy pieces 

We are now fully equipped for venturing into the depths of the architecture. In order 
to be able to discuss and evaluate the mechanisms it brings forth, a rather profound 
understanding of its structure is needed. However, I will go no further than necessary for 
the continuation of this paper. For those who want to know all the ins and outs, Mitchell 
[16] is the perfect guide. We'll now turn right away to the three major components of the
architecture. 

2.2.1 The Slipnet - a dynamic semantic network 

Figure 2: Analogy-making in Metacat. [Snapshot of a run; the rule shown reads "Change letter-category of rightmost letter to successor".]

The Slipnet could be envisaged as our long-term memory. It is a network of concepts
(nodes) connected by conceptual relations (links). However, these nodes carry an ever
changing activation, like an electrical tension, reflecting how relevant the program currently
deems the concept to be. When the program deems the concept relevant
(when there is an instance of this concept in the given problem), it supplies its node
with a jolt of activation. If the activation passes a certain threshold, the concept even 
has a probability to jump to full activation, as the program focuses on it. Over time, 
this activation decreases gradually, as concepts slowly fade away if the program doesn't 
show renewed interest in the particular concept. Also, some concepts are more directly 
perceivable than others, which is reflected by their depth-value, shallow concepts being 
perceived immediately while for deep concepts this is much harder. It takes a while for 
these deep concepts to get a lot of activation, but they also lose their activation much 
more slowly than shallow ones do. When you discover (notice the high activation of) a 
deep concept, it seems important, containing the essence of the problem, and you want 
to focus on it for a while. 
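
The behavior of a single node can be sketched in a few lines of Python. This is a minimal illustration, not Copycat's implementation: the 0-to-1 scales, the jump probability (taken equal to the current activation) and the decay formula (a node loses the fraction 1 - depth of its activation per step) are all assumptions of mine, chosen only to reproduce the qualitative behavior described above.

```python
import random
from dataclasses import dataclass


@dataclass
class SlipnetNode:
    name: str
    depth: float             # assumed scale: 0.0 (shallow) to 1.0 (deep)
    activation: float = 0.0  # assumed scale: 0.0 to 1.0

    def jolt(self, amount):
        """An instance of the concept was noticed: add activation."""
        self.activation = min(1.0, self.activation + amount)

    def maybe_jump(self, threshold, rng):
        """At or above the threshold, the node may jump to full activation.

        Assumption: the jump probability equals the current activation.
        """
        if self.activation >= threshold and rng.random() < self.activation:
            self.activation = 1.0

    def decay(self):
        """Lose a share of activation; deep nodes lose a smaller share."""
        self.activation -= (1.0 - self.depth) * self.activation


rng = random.Random(0)
letter_a = SlipnetNode("a", depth=0.1)          # shallow, easily perceived
opposite = SlipnetNode("opposite", depth=0.9)   # deep, hard to perceive

for node in (letter_a, opposite):
    node.jolt(1.0)
for _ in range(3):
    letter_a.decay()
    opposite.decay()

# After three steps the deep concept has kept most of its activation,
# while the shallow one has almost completely faded.
print(letter_a.activation, opposite.activation)
```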

Correspondingly, links have an ever changing conceptual distance, reflecting the program's
current estimate of the closeness of the two concepts they connect. There are a
limited number of link types, and each type has a concept describing it (a label), for example
identity or opposite. These labels are treated as any other concept, but they have an
extra influence on the conceptual distance of their link type. When a label becomes highly
activated, its links shrink, as the current conceptual distance between the two concepts
decreases, and vice versa. When opposite becomes activated, all opposite concepts in the
Slipnet get 'closer' to each other, since you want diametrical slippages to occur more easily
while the program focuses on oppositeness. Now, imagine what happens if these links are
to conduct activation, at a rate inversely proportional to their conceptual distance (you 
could see this distance as the electrical resistance of the link). We'll illustrate what would 
happen in Copycat's Slipnet, depicted in figure 3 (the link lengths in this figure have no 
meaning). 

In the beginning, there is only an activation-free Slipnet. Then some letter-strings are
given to the program. The letters in these strings are regarded as instances of the letter
concepts in the Slipnet. Because of this relation, the nodes of these letters become highly 
activated. These concepts are shallow, since they are easily perceivable, and their activa- 
tion decays quite fast. However, they also can spread some activation through their links 
(for example, from 'a' to 'first'). The letter instances also get related to their position 
in the string: the first letter gets linked to 'leftmost', the middle letter to 'middle' and
the last one to 'rightmost'. These relations with instances are not shown in the figure
since they are not a part of the Slipnet (you could imagine them to be perpendicular
to the figure's plane). Because of this, these three concepts also receive high activation 
(they're relevant), and they can spread it further to 'left', 'right' and 'Direction'. These 
last concepts are deeper, and typically receive little activation at a time (due to their 
distance), but they also decay slower. After a while, activation has spread toward all (at 
first sight) relevant concepts. Also, when some successive letters are given, the concept 
'successor' will become highly activated (you'll soon know how), and successor-links will 
shrink considerably, facilitating flow through these links, making some successive letters
seem relevant, and forcing the program to focus on successorship. The Slipnet thus shapes
itself to the problem. Alas, without more architecture, the Slipnet will quickly lose all
its activation, as activation spreading by itself will quickly capitulate to the
relentless decay process.
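
The interplay between labels, link lengths and spreading activation can be sketched as follows. Again, this is an illustration under stated assumptions rather than Copycat's code: the class names, the rule that a fully active label halves a link's distance, and the spreading rate are all invented for the example. Only the principle (conduction inversely proportional to conceptual distance, with labels shrinking their links) comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    activation: float = 0.0  # assumed scale: 0.0 to 1.0


@dataclass
class Link:
    source: Node
    target: Node
    base_distance: float          # assumed: positive, larger = further apart
    label: Optional[Node] = None  # e.g. the 'successor' or 'opposite' node

    def distance(self):
        """Current conceptual distance; an active label shrinks the link."""
        if self.label is None:
            return self.base_distance
        # Assumption: a fully active label halves the distance.
        return self.base_distance * (1.0 - 0.5 * self.label.activation)


def spread(links, rate=0.1):
    """One synchronous step: every link conducts activation from its source
    to its target, inversely proportional to its current distance."""
    transfers = [
        (link.target, rate * link.source.activation / link.distance())
        for link in links
    ]
    for target, amount in transfers:
        target.activation = min(1.0, target.activation + amount)


successor = Node("successor")
a, b = Node("a", activation=1.0), Node("b")
a_to_b = Link(a, b, base_distance=0.5, label=successor)

spread([a_to_b])
print(b.activation)         # little flows while 'successor' is inactive

successor.activation = 1.0  # the program starts focusing on successorship
b.activation = 0.0
spread([a_to_b])
print(b.activation)         # the shrunken link now conducts twice as much
```

Adding a decay step to every node after each call to spread would reproduce the last observation of the paragraph above: without fresh jolts from the Workspace, the network simply drains.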

Like most semantic networks, Copycat's Slipnet has different 'classes' of links, which 
are sometimes used to focus on a certain relationship. Copycat distinguishes Category/Instance
links (in fact two links in opposite directions), Property links, Slip links
(between concepts that can slip into each other, one replacing the other; they're indicated in
thicker lines in the figure) and lateral links (for non-hierarchical semantic relationships).
Through it all, they all behave like normal Slipnet links, spreading activation.

[Figure 3 diagram; legend: location-related concepts, basic concepts, basic concept-related concepts.]

Figure 3: Copycat's Slipnet.

2.2.2 The Workspace - a construction site for perceptual structures

The Workspace corresponds to the blackboard in certain programs. It can be envisaged 
as a construction site on which an architecture contest is held. A large number of inde- 
pendent builder teams are given the same set of building blocks (the basic concepts of a 
problem, e.g. letters), and are challenged to use these to build the most elegant structures,
starting at different times, but working completely in parallel. Moreover, when a new
team enters the site and finds structures already built by other teams, it
may use these structures in building a bigger one. There is however a catch, since in the
end, all structures should together form a coherent whole, and should be well integrated 
in the surroundings (the problem's context). Thus, when two teams independently build 
two elegant, but clashing structures, eventually one of the two must give up and tear the 
structure down. Typically, the structure which is best integrated into the surroundings 
survives. Slowly, a consensus emerges, although sometimes an ingenious team can decide 
to go against the flow, constructing another elegant structure in parallel. When that
structure is sufficiently convincing, it may survive and tear the bigger, established
structure down. To give a feeling where this leads us, imagine the view of a city
recorded over decades, but played back in a minute or so. At some point a building arises 
in a new style. If it is appreciated, soon new buildings in the new style appear, trans- 
forming the view of the city. The difference here is that, if at some point in time the city 
is built entirely in a single style, the contest ends, and the solution to the problem is found. 

This translates to the notion of viewpoints. First, the Workspace is filled with lonesome 
concepts. As we mentioned earlier, these are instances of the concepts in the Slipnet, the 
latter being Platonic, unique, indestructible and not binding with each other. Early work- 
ers will link these instances up with their Platonic counterparts, by attaching descriptions 
to them. When more workers arrive, they first survey the scene. They have a tendency to 
start working on the most salient instances. The salience value of an instance is a total 
of the number of descriptions it has, the activation of the Slipnet nodes it is connected 
to via these descriptions, and its unhappiness. A concept is unhappy when it is hardly 
used in the existing structures, or is badly integrated. Thus, it cries out for attention. 
Eventually, the workers bind some instances into the first conceptual structures. What 
these structures look like, depends entirely on the skills of the workers, and is as such very 
problem-dependent. We'll discuss these workers and their abilities and behavior in the 
next section. Descriptions are an exception here, they are in fact a fundamental part of 
the architecture, as will be discussed right away. However, once a structure is built, it gets 
a strength. The strength is influenced, first of all by the structure's own properties (e.g. 
the depth of the concepts used), but also depends on other structures already built (e.g. 
how well it fits in), and is evaluated by other workers. The stronger a structure, the easier 
it can beat its rivals, if necessary. Some structures can in turn be used as parts of other 
structures, and can receive their own descriptions. As more and more structures arise 
and the Workspace becomes more complex, automatically a drive toward consistency and 
the use of deep concepts arises, as 'crazy' structures won't survive long. The Workspace
has now created an (imaginary) viewpoint, and will try to perfect this viewpoint until it is
supreme (a solution is found). However, some alternative viewpoints are always present 
in the background, and sometimes they can gain enough power to topple the established 
viewpoint. This often yields very unexpected new insights. 
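
The way salience steers the workers' attention can be sketched as follows. This is a minimal illustration, assuming an unweighted sum of the three ingredients named above and a choice probability proportional to salience; the actual weighting in Copycat may well differ, and the Workspace contents are invented.

```python
import random


def salience(description_activations, unhappiness):
    """Unweighted total of: number of descriptions, activation of the
    described Slipnet nodes, and unhappiness (the weighting is assumed)."""
    return (len(description_activations)
            + sum(description_activations)
            + unhappiness)


def pick_instance(instances, rng):
    """Choose an instance with probability proportional to its salience."""
    names = list(instances)
    weights = [salience(*instances[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical Workspace contents:
# name -> (activations of described Slipnet nodes, unhappiness)
instances = {
    "i": ([0.9, 0.2], 0.1),  # already used in a structure, fairly happy
    "k": ([0.9], 1.0),       # not yet used in any structure, unhappy
}

rng = random.Random(42)
print(pick_instance(instances, rng))
```

Because the choice is probabilistic, a less salient instance still gets attention now and then, which keeps the alternative viewpoints mentioned above alive in the background.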

Time for an example, don't you think? Figure 4 depicts the workspace of Metacat 
during an actual run. The building blocks here are, of course, letters. A variety of 
structures has been constructed: 

Bonds are not drawn here, but they are the basis for groups. Two letters or groups in 
the same string can be bonded by a sameness, successor, or predecessor relation.

[Figure 4 diagram; bridge labels: lmost=>lmost, rmost=>rmost, group=>letter, letter=>group.]

Figure 4: Structures emerging in the Workspace.



Groups bind two or more concepts with a common bond. Sameness groups (boxes) "a-a"
and "k-k" arose, and also two successor groups appeared, "i-j" and "i-j-k". The
latter clashes with "k-k" and will probably not survive, since 'identity' is a deeper
concept than 'successor' in the Slipnet. Still, these two could coexist for quite a
while. It is drawn with a dashed line because it's still being evaluated, as we'll
discuss shortly.

Bridges (correspondences) bind corresponding letters or groups in different strings.
They can stand for an identity mapping (sameness => sameness, left => left), or a
conceptual slippage (letter => group, left => right). The strength of a bridge depends
on the ease of the slippage, the number of identity mappings underlying it, and
how it relates to other bridges. A deep slippage (left => right) is less likely to
occur, but also creates a strong bridge once conceived. There is for example one
dashed bridge which wants to achieve a group => letter slippage, but the "A-A"
bridge will probably be stronger thanks to the identity mapping of two groups, both
deeper concepts. Likewise, a vertical "A-K" bridge is being considered. This one
is bound to become a very strong bridge, because it involves the slippage (leftmost => rightmost),
and 'opposite' is a very deep concept. However, it could take some time before this
bridge is actually built, because the program is not yet focusing on the hard to reach
'opposite' concept.

Descriptions are attached to instances (and structures) to describe what's known about 
them. They set up a connection between instances in the Workspace and concepts in 
the Slipnet. A description consists of a description type and a descriptor (e.g. "object
category: letter" and "letter category: m"), and both must be Slipnet concepts. As
we shall see, an instance is hereby bound to the fate of those concepts. If the concepts
are highly activated, the instance or structure will be deemed important, and vice 
versa. Figure 5 shows the descriptions (only the descriptors are shown) attached 
to a mature letter-string with many structures built in. This way of representing 
descriptions actually comes from a Copycat implementation by Bolland 0, and 
although it is limited to problems where groups can only be formed between adjacent 
elements (like in Copycat), it's very instructive. In the column under each instance 
are its attached descriptions and the rows spanning multiple instances show reified 
structures existing between the spanned instances. With the two instances of the

Figure 5: Descriptions in a letter string.



letter V for example, a group has been built. Now, further descriptions can be 
attached to this group, such as group category: sameness group, length: 2 and 
string position: middle. As we have mentioned before, the most basic descriptions 
(here the first three rows), are attached in advance. The other descriptions will be 
attached by the program. 

Rules are structures made to articulate the nature of bridges, such as [rightmost =>
rightmost, letter => group] for the "c-K" bridge. They are the actual 'output' of the
program, defining the analogy.



2.2.3 The Coderack - a probabilistic ant colony 

If we were facing a classical program, the Coderack would be the locus of its set of
control algorithms. Here, however, it doesn't really impose control, and it is hardly just a set of
algorithms. Then, what is it? I already compared the behavior of the architecture with
an ant column; we'll now see why. The 'ants' here are 'codelets', small pieces of code
performing small actions (related to building structures) in the workspace. They're so
small and so large in number that it wouldn't really hurt if some codelet wasn't executed, 
or got terribly delayed. Also, they live in a hierarchical society. There are some basic 
kinds of codelets, and each kind normally has a specialist for each structure, although you 
could easily invent new breeds as well: 

Scouts just wander around in the Workspace and examine instances or structures that
interest them. If a scout thinks there's a nice structure to be built with them, the
only real action it can take is to propose that structure and to call in other
codelets to take it from there.

Evaluators come and look for the type of object they were called in for. They are
specialized in estimating the promise of a certain structure. Will it be strong? Will
it fit in with the other structures? When an evaluator deems the structure interesting enough,
it calls in another codelet to do the actual work (or, if you like, it leaves a trail of
pheromones for the big boys in the column to follow).

Builders then come and reify the structure they're specialized in. 

Special codelets will do some other specific tasks. Breakers, for example, will come 
and tear some (weak) structures down when the Workspace doesn't really seem to 
go anywhere. 

For example, when a description scout runs into an instance that could be linked up 
to a Slipnet concept, it calls a description evaluator to assess the quality of this possible
description. When it appears to be fruitful, a description builder will eventually come
and tie up the deal. 

This scheme of exploring, evaluating, and finally building is called the parallel terraced
scan. It originates (like many concepts discussed here) from Hofstadter's Jumbo
program [8, ch.2] (see also section 4.3.2), an important predecessor of Copycat. Imagine
falling in love. You don't just see another human being, marry him or her, and file for 
divorce if it doesn't turn out the way you wanted. You can easily rule out a lot of possibil- 
ities just checking them out superficially. Occasionally, a spark shoots out. Probably, this 
happens a lot, possibly with more than one person at the same time. Then, by getting 
to know these people better, a flash can occur, although mostly just with a very small 
number of people. This doesn't necessarily destroy the spark you had with other people, 
you may still like them, but not "the same way". As you get to know more and more
about the most interesting people, perhaps flirting a little, you can get more and more 
attracted, and eventually you can fall head over heels in love. Then, the time is ripe for 
the happy couple to solemnize this bond (or another kind of structure). This natural 
scheme meets the real-time pressures any human is subject to [U p. 107]:

Figure 6: Copycat's Coderack, early during a run. [12



This is the idea of the terraced scan: a parallel investigation of many possibil- 
ities to different levels of depth, quickly throwing out bad ones and homing in 
rapidly and accurately on good ones. 

In computational terms, you can use a lot of computationally cheap scouts to check 
out which structures are even remotely interesting. Then, hopefully having excluded an 
awful lot of possibilities, the more expensive evaluators can come to size up the situation, 
leaving just a few structures behind for actual building. Building inappropriate structures 
can be very expensive, resulting in a lot of wasted time. 
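
The funnel of scouts, evaluators and builders can be sketched for the simplest structure, a bond between two adjacent letters. This is a deliberately sequential toy, so it shows only the terracing (cheap checks first, expensive ones later), not the parallel interleaving discussed next. The function names, the strength values and the cutoff are assumptions made for the example.

```python
def scout(pair):
    """Cheap check: are the two letters at most one alphabet step apart?"""
    first, second = pair
    return abs(ord(first) - ord(second)) <= 1


def evaluate(pair):
    """More expensive check: name the bond and estimate its strength.

    The strength values are invented; they only encode that sameness is
    treated as the more convincing relation here.
    """
    first, second = pair
    step = ord(second) - ord(first)
    kind = {0: "sameness", 1: "successor", -1: "predecessor"}[step]
    strength = 1.0 if kind == "sameness" else 0.6
    return kind, strength


def terraced_scan(letters, min_strength):
    """Return (pairs examined, pairs proposed, bonds actually built)."""
    pairs = list(zip(letters, letters[1:]))
    proposed = [pair for pair in pairs if scout(pair)]
    evaluated = [(pair, *evaluate(pair)) for pair in proposed]
    built = [(pair, kind) for pair, kind, strength in evaluated
             if strength >= min_strength]
    return len(pairs), len(proposed), built


print(terraced_scan("aqkk", min_strength=0.5))
print(terraced_scan("ijkk", min_strength=0.8))
```

In the first call the scouts already discard two of the three pairs, so the evaluator only has to look at one. In the second call all pairs are proposed, but only the sameness bond is strong enough to be built.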

Chopping up the construction of a structure in different phases also enables us to 
virtually do all this processing in parallel, by interleaving these small pieces of code. Two 
structures can thus be built 'in parallel', and they can seriously influence each other if
they have common interests. Mostly the stronger structure gets built, making the other
impossible, but that's not always so. Sometimes they can even coexist for a while. In a 
parallel environment, a process that is very important should be executed faster than its 
less interesting kin. We do this by giving each codelet an urgency value. Scouts are of 
little importance, so they are given a low urgency. However, this scout can stumble upon 
something very interesting and create some other codelets with a higher urgency. These 
in turn might create some other codelets, either with a high or a low urgency, according 
to the promise of the structure they're examining. But, doesn't that lead you back to 
a basically serial system, with a lot of high urgency codelets, forcing the less important 
scout to wait for a long time, thus destroying the systems parallelism? Indeed, that's 
why codelets are placed in the Coderack, and the Coderack decides probabilistically which 
codelet may run next. The high urgency codelets thus still have a large probability to be 
unleashed soon, but every now and then, an unimportant scout can leave as well. Also, 
codelets who are neglected for a long time gradually get an increased urgency. Now, say 
you have 79 codelets with urgency 1, and one codelet with urgency 21, then the high 
urgency codelet still only has a chance of 1/5 to be chosen. Doing all this, the Coderack 
can quickly adjust to new needs of the system. Evaluator codelets will assign high urgency 
values to codelets that will further the current viewpoint of a workspace. However, if a 
viewpoint gets unstable, on the verge of falling for a new viewpoint, the parallel process- 
ing will 'fairly' judge which one survives, and adapt itself quickly to the new ruling force. 
Figure 6 shows Copycat's Coderack during a run. 
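
The urgency-weighted choice can be illustrated with a small sketch. It assumes that the probability of a codelet being picked is simply proportional to its urgency, which is consistent with the 1/5 figure above but is not taken from Copycat's source. The `(name, urgency)` tuple representation is hypothetical, and the gradual urgency increase of neglected codelets is not modeled.

```python
import random
from typing import List, Tuple

Codelet = Tuple[str, float]  # (name, urgency); hypothetical representation


def selection_probabilities(codelets: List[Codelet]) -> List[float]:
    """Probability of each codelet being picked, proportional to its urgency."""
    total = sum(urgency for _, urgency in codelets)
    return [urgency / total for _, urgency in codelets]


def pick_codelet(codelets: List[Codelet], rng: random.Random) -> Codelet:
    """Remove and return one codelet, drawn with urgency-weighted probability."""
    weights = [urgency for _, urgency in codelets]
    index = rng.choices(range(len(codelets)), weights=weights, k=1)[0]
    return codelets.pop(index)


# The example from the text: 79 scouts of urgency 1 and one codelet of urgency 21.
rack = [(f"scout-{i}", 1.0) for i in range(79)] + [("builder", 21.0)]
p_builder = selection_probabilities(rack)[-1]  # 21 / (79 + 21)
```

Because the draw is random rather than a strict priority queue, the 79 low-urgency scouts together still win most of the draws, which is what keeps the exploration parallel.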

There are three ways in which codelets enter the Coderack. We've seen one already: 
the follow-up codelets, created by other codelets in order to work further on their 
findings. Since scouts disappear when they have fulfilled their duty, the Coderack must 
also replenish itself with new scouts, to keep the parallel exploration for new possibilities going. 
These scouts are bottom-up codelets, since they are independent of the given problem and 
control. In contrast, there are also top-down codelets, which are discussed in the following 
section. 

2.3 How these pieces fit together 

Just as the success of any complex adaptive system (economies, ecologies, swarms, ...) 
depends on the way its different components communicate, our architecture will only be 
useful when appropriate communication mechanisms are provided between its three 
main components. Ant colonies are another system in which organization and communication 
are of the utmost importance. We will use them as an illustration, showing why these 
mechanisms are sensible. 

Imagine a mad scientist who has bred three ant colonies. Colony A has ants that don't 
leave pheromones, colony B has ants leaving and following pheromones, but not leaving 
more pheromones while they are following another ant's trail, and colony C has 
exceptionally shy ants not leaving the nest at all. His subsequent field studies revealed a dreary 
outcome. Colony A had ants that wandered around aimlessly, occasionally bumping into 
a food supply, bringing home a tiny bit of it, but unable to retrace the route or to communicate 
it to others. Soon they starved. Colony B lasted a bit longer, since now ants followed 
each other to the same food supply, but only until the pheromone trail evaporated and the 
rest of the supply was not harvested further, unless accidentally found again. The colony 
perished due to a lack of efficiency. The last colony blossomed temporarily under an evil 
queen enforcing cannibalism, but it didn't last long. The moral of the story is that there 
must be some feedback loop between distinguishing good trails and continually testing 
their quality, between perceptual activity [Workspace] and conceptual activity [Slipnet], 
as depicted in figure 7. 

Figure 7: A feedback loop between perceptual and conceptual activity. 



When a discovery is made [food is found] in the Workspace, a description is attached 
to [pheromones are left toward] the instances or structures playing a role in it. This 
results in a substantial jolt of activation being sent to the corresponding Slipnet nodes. 
You can imagine the activation going through the description links (the dashed lines in 
the figure). This 'discovery' can be any kind of change, like linking up an undescribed 
instance or constructing a new structure of any kind. The activation then spreads to 
related nodes [somewhat like pheromone interplay], maybe also changing the conceptual 
distances. The Slipnet thus adapts itself to new perceptions [trails], putting these to the 
test in order to see if they can stand up against others. It does so by creating related 
top-down codelets. These are special codelets, trained to investigate a specific concept, 
and are given an urgency proportional to the sending node's activation. Suppose a 
structure in the Workspace was made that led to a high activation of the 'sameness-group' 
node; then the Slipnet could dump some 'sameness-exploring scouts' into the Coderack, 
to see if more sameness structures can be found. If so, the 'sameness-group' node will 
get even more activation, and will further try to steer the perception toward a viewpoint 
focusing on it. Top-down codelets may seem quite powerful, but they are still far too weak 
to lead directly to the dominance of their master node. Instead, by the commingling of a lot of 
top-down codelets, the interests of their masters get blended, like different small pressures 
resulting in one large general pressure. Still, any discovery, if interesting enough, can launch the 
perception in a certain direction, like pheromones attracting ants toward a certain trail. 
This may lead to a new powerful viewpoint [a large food supply], or may 'evaporate' soon 
if it isn't interesting enough, letting the perception drift in another direction. 
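
One way to picture this loop is the toy model below. It is a sketch under assumptions, not the Copycat implementation: the 0..100 activation scale, the one-step spreading rule, the decay rate and the posting threshold are all hypothetical choices, as are the class and method names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Slipnet:
    # Activation per concept node, on an assumed 0..100 scale.
    activation: Dict[str, float] = field(default_factory=dict)
    # links[a] = [(neighbour, fraction of a's jolt passed on)]; assumed form.
    links: Dict[str, List[Tuple[str, float]]] = field(default_factory=dict)

    def jolt(self, node: str, amount: float) -> None:
        """A Workspace discovery described by `node` activates it, and a
        fraction of that jolt spreads one step to the linked nodes."""
        self._add(node, amount)
        for neighbour, fraction in self.links.get(node, []):
            self._add(neighbour, amount * fraction)

    def decay(self, rate: float) -> None:
        """Activation fades unless perception keeps confirming the concept."""
        for node in self.activation:
            self.activation[node] *= (1.0 - rate)

    def top_down_codelets(self, threshold: float) -> List[Tuple[str, float]]:
        """Sufficiently active nodes post scouts; urgency equals activation here,
        which is the simplest form of 'proportional to activation'."""
        return [
            (f"look-for-{node}", act)
            for node, act in self.activation.items()
            if act >= threshold
        ]

    def _add(self, node: str, amount: float) -> None:
        # Clamp at the assumed maximum of 100.
        self.activation[node] = min(100.0, self.activation.get(node, 0.0) + amount)


net = Slipnet(links={"sameness-group": [("sameness", 0.5)]})
net.jolt("sameness-group", 60.0)                  # a sameness group was built
codelets = net.top_down_codelets(threshold=50.0)  # scouts sent to the Coderack
```

If no further sameness structures are found, repeated calls to `decay` let the node drop below the threshold again, and it stops posting scouts: the trail 'evaporates'.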

As you already may have noticed, colony A lacked the "Workspace → Slipnet" link, and 
Slipnet activity dwindled rapidly to zero. Colony B missed the "Slipnet → Coderack" 
link, not being able to keep the focus on interesting pathways. Finally, colony C was 
born without the very substantial link between Coderack and Workspace, resulting in 
no perceptual activity at all. Cutting any of these links thus obstructs the propagation 
of rational pressures needed to guide the perceptual activity, resulting in a very chaotic 
Workspace behavior. 

2.4 Meanwhile, on a higher level... 

We could carry our analogy just a tiny bit further. We could think up a colony where all 
the ants stick together when they are doing something. The flock wanders around until 
some food is found, then they all work together bringing it home, and then they start 
over again from scratch, hopefully in another direction. Indeed, this would be a colony 
without parallel terraced scan. Evolution, however, has shown that for ant-like systems, 
continuous parallel exploration combined with continuously updated communication is the 
most efficient way. Holland [5J HZ] has dubbed this idea "the balance between exploration 
and exploitation": pathways should be exploited at a rate and intensity related to their 
estimated promise, which is continually updated. At all times, however, searching for new 
possibilities should continue. 

This idea is important in all complex adaptive systems, such as our brain. Without 
continuous rational pressure and the (mostly unconscious) presence of all sorts of small 
ideas and viewpoints in the background, our reasoning process would be unable to pick 
up speed. There is as such a very important interplay between bottom-up and top-down 
processes. Mitchell [17] explains how these commingle: 

When a person has little information about the situation facing it, the ex- 
ploration of possibilities starts out being very random, highly parallel (many 
possibilities being considered at once) and "bottom-up": there is no pressure to 
explore any particular possibility more strongly than any other. As more and 
more information is obtained, exploration gradually becomes more focused (in- 
creasing resources are concentrated on a smaller number of possibilities), less 
random, and more "top-down": possibilities that have already been identified 
as promising are exploited. 

In our architecture, this strategy emerges from the interaction just described. In the 
beginning, the Coderack generates only "bottom-up" scouts and structure-building in the 
Workspace is highly random. As some basic structures arise, they get communicated to 
the Slipnet, which will evoke pressures to consider this particular viewpoint-to-be. If the 
viewpoint is able to enforce itself, resulting in a lot of new structures consistent with it, 
the Slipnet gets more 'convinced' the viewpoint is important and will investigate it even 
further. As the Slipnet takes shape, and its activation forms a certain "pattern", with 
a number of nodes highly activated and somewhat stable, the "top-down" codelets sent 
to the Coderack also tend to form a certain pattern. The Coderack only gets a limited 
number of high urgency codelets, thus turning more and more toward a serial system. 
The codelets it sends out will mostly enforce the ruling viewpoint until it is perfected. 

It is important to see how things such as 'pressures' and 'viewpoints' gradually emerge. 
Just as a structure gets stronger and stronger in the Workspace until it reaches a 
threshold and actually gets built, a viewpoint (consisting of structures) gets stronger 
and stronger in the Slipnet until it also reaches some vague threshold and is 'taken for 
granted', at least temporarily. In the same way, pressures are only the consequence of a 
lot of parallel interactions. There are both perceptual pressures (attempting to establish 
a certain viewpoint) and conceptual pressures (attempting to realize instances of 
concepts in this viewpoint). All these are consequences of the interplay of the different urgencies of 
codelets, concept activations, conceptual distances, strengths and saliences. For example, 
top-down codelets clearly act like proxies for pressures, all to different degrees. All 
patterns arising are thus completely emergent, and there is absolutely no large-scale action 
'planned' in advance. Hofstadter [8, p. 225] exemplifies this with a basketball game: 

Any move is simultaneously responding to a complex constellation of pressures 
on the floor as well as slightly altering the constellation of pressures on the 
floor. (...) Although the crowd is mostly concerned with the sequence of players 
who have the ball, and thus tend to see it as a localized, serial process unfolding, 
the players who never have the ball nonetheless play pivotal roles, in that they 
mold the globally-felt pressures that control both teams' actions at all moments. 
A tiny feint of the head or lunge to one side alters the probabilities of all sorts 
of events happening on the court, both near and far. (...) Each event consisted 
of distributed, swiftly shifting pressures pushing for certain types of plays and 
against others... 

2.5 Incorporating self-watching 

While solving a problem, you can have a feeling you are close to a possible solution, and 
you start to think differently, more focused and more serially, in order to narrow your 
search toward a very particular viewpoint. On the other hand, when you think you're off 
target, you'll probably try to be more open-minded, thinking more in parallel, hoping to 
find a new interesting viewpoint. This requires a basic kind of self-watching. Also, one very 
obvious problem this architecture has is that the same inherently flawed viewpoint may 
be chosen over and over again, whereas you would quickly notice this and change course. 
Maybe you'll also try to use past experience, and you're able to say something like "Hey, 
this problem looks just like that other one!" We shall now introduce additional architecture 
to incorporate these mechanisms of self-watching, albeit in a very coarse way. Still, this 
will greatly improve the architecture's performance, as has been shown by the Metacat 
project [12], which implemented the following architectural extensions (temperature was 
already present in Copycat; Metacat just extended it): 

The Temperature is a (reversed) measure of confidence in the current viewpoint. It is 
influenced by the patterns and relationships in its perceptual data, such as structures 
in the Workspace and how well these fit together, but also by patterns occurring 
in its own processing of this data, for example, which themes are used. Think of it 
as a construction project manager, having authority over a large group of workers. 
Each worker works independently, and after the job is done, he or she reports to the 
manager. When enough progress is being made (like building elegant structures), 
the manager will increasingly encourage other workers to continue the work in this 
fashion. However, when progress stalls, he will tell workers to tear some of the 
weaker structures down and try something else. When codelets are building strong 
structures (fitting into the surroundings), they will lower the temperature. If the 
temperature rises due to slow progress, this will send in a lot of breaker codelets, 
tearing structures down. On the other hand, structures will be evaluated more 
carefully by codelets if the temperature is low, because we shouldn't be working 
at random when we're getting close. It thus regulates the open-mindedness of the 
program, being very open-minded when it's high, but very narrow-minded when 
it's low. It also plays an important role in determining whether the program should 
stop when finding a possible analogy, or if it should continue looking for a better one. 

The name 'temperature' originates from the Jumbo project. The metaphor used was 
that of enzymes active in a cell. Certain genes [nodes] in the DNA [Slipnet] sometimes 
created certain enzymes [codelets], which acted on small molecules [instances] 
to build macro-molecules [structures] in the cytoplasm [Workspace]. The workspace 
was actually called 'cytoplasm', and the Coderack and codelets were used, but the 
Slipnet wasn't yet introduced. The temperature was guiding the processing: a high 
temperature made organic bonds less stable, quickly breaking down, while a low 
temperature meant that structures were more stable, like in a strong viewpoint, and 
bigger structures could be built. 

The Themespace is home to themes: pairs of Slipnet concepts, representing meta- 
structures in the Workspace. For example, two different bridges may get a theme, 
like [Alphabetic-Position - opposite]. They are thus somewhat like a shorthand 
description of the Workspace's viewpoint. The concepts in a theme are parallel 
copies of Slipnet nodes, and they act the same way. They also have an activation 
level (not equal to the concept's activation in the Slipnet), representing the extent 
to which they are present in the Workspace. Their activation also decays and gets 
influenced by other themes' activation, and they can exert top-down pressure on the 
perceptual activity. A special feature is that their activation can also be negative, 
discouraging the creation of structures compatible with it. 

The Episodic Memory is used for remembering problem-solving experiences over time. 
Every answer found gets an answer description, an archive containing all the 
information relevant to this answer. This includes the answer itself, all the Workspace 
structures directly involved in creating the answer, all the theme patterns, and all 
the rules found. Since themes offer such great insight into the previous answer's 
processing, they are also used as a kind of index to this information. This way, 
Metacat can quickly compare answers and strategies used in previous runs. 

The Temporal Trace is another kind of short-term memory, storing information about 
events occurring during a run, if they have an importance passing a certain threshold 
level. These include high activation fluctuations of deep concepts, the creation of big 
or strong groups, deep slippages, important rules, answers, snags (getting trapped 
in a viewpoint and deliberately toppling it), and clamping of theme-patterns (when 
a certain viewpoint seems very important, and you don't want the themes to change 
anymore). They thus offer a high-level view of what the program is doing and 
influence its behavior, for example, to avoid making the same mistakes. 
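
To make the role of the Temperature more concrete, here is a minimal sketch of the three effects described above. All formulas and constants are illustrative assumptions, not the ones used in Copycat or Metacat: temperature is taken as 100 minus the mean structure strength, choices are sharpened by an exponent that grows as the temperature drops, and the chance of a breaker codelet succeeding rises with temperature and falls with structure strength.

```python
from typing import List


def temperature_from_structures(strengths: List[float]) -> float:
    """Assumed rule: 100 minus the mean structure strength (strengths in 0..100).
    With no structures at all there is nothing to be confident about."""
    if not strengths:
        return 100.0
    return 100.0 - sum(strengths) / len(strengths)


def sharpen(weights: List[float], temperature: float) -> List[float]:
    """Turn positive raw weights into choice probabilities.
    A low temperature exaggerates differences (narrow-minded),
    a high temperature flattens them (open-minded)."""
    exponent = 0.5 + (100.0 - temperature) / 40.0  # illustrative constants
    powered = [w ** exponent for w in weights]
    total = sum(powered)
    return [p / total for p in powered]


def break_probability(strength: float, temperature: float) -> float:
    """Chance that a breaker codelet tears a structure down: weak structures
    are at risk, and more so when the temperature is high."""
    return (temperature / 100.0) * (1.0 - strength / 100.0)
```

With weights 1 and 4, for example, `sharpen` gives the stronger option a probability of 2/3 at temperature 100 but 64/65 at temperature 0, which is the shift from open-minded to narrow-minded behavior described above.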




Figure 8: Architectural extensions in Metacat. [T^] 

3 Merits and Drawbacks 



All things considered, ants [codelets] and scattered crumbs of food [concepts] alone 
won't win Nobel prizes (although we must grant ant colonies at least some kind of 
intelligence), and likewise it takes quite a lot of preparation before our architecture 
can actually produce something interesting. You have to enter all the concepts in the Slipnet, set 
their depth values and decay rates, make linkages, connect labels to them and so forth. 
You also have to write the codelets, specify what they must do, what they must look 
for and how to judge a certain structure in the Workspace. This all seems an awful lot 
of work, all for the sake of incorporating perception. This immediately raises a host of 
questions. Is it really worth it? What do we actually mean by perception, and 
how does the architecture's own perception-making help it in conceiving its goals? And 
how can fluid concepts and parallel terraced scanning help in doing perception? Other, 
more practical questions also arise. For example, how important are all these parameters 
we have to supply, and how dependent is the system on their exact values? How will its 
behavior be altered by bad conceptual depths or distracting extra nodes or linkages? Can 
it easily handle larger domains than the one just described, and if so, what can it be used 
for? These are the questions addressed in this section. 

I'll first try to give a feeling of what perception is about and why one wants to incorpo- 
rate it. Then a comparison will be made with more traditional systems to pinpoint exactly 
where perception becomes important and how fluid concept architecture incorporates it 
(showing its advantages). Afterwards, I'll discuss the system's parameter dependency, 
noise-resistance and applicability. 

3.1 Incorporating perception: Why? 

Whereas perception is an essential part of human processing of incoming or stored data, it 
is often neglected (or rather, passed on to humans) in many subfields of AI. Can it really 
be neglected in high-level processes of thought without loss? Can it be added later on? 
What are the real benefits resulting from doing perception? And how does fluid concept 
architecture actually incorporate it? 

3.1.1 Levels of perception 

In order to get a general feeling of what perception is all about, let's try to discern what 
happens when you look at this very sentence. Let's start with a letter. Computer vision 
theories state that first, a filtering takes place to retain only the data that carries much 
information, such as edges. Once we know which elements might be important, we want 
to impose some kind of order on them. A natural impulse is to group them on some 
common basis. Sameness is obviously very interesting. In vision, closeness and continuity 
are also tremendously important, as is favoring the 'simplest' possibility. These criteria 
are of course subtler than mentioned here, but we must get on with our story. Take a look 
at figure 9, where we've arrived at some preliminary form of lines, arcs, junctions and so 
on. So far, all processing was 'bottom-up', but now we're entering the realm where this 
information blends with stored information, 'awakened' by perceived similar structures. 
Now an intricate interplay between bottom-up and top-down influences starts. You want to 
get a grip on these structures, know what they're about, find their 'role' in the context. 
For example, a structure can serve as a 'loop' in the letter 'b', or maybe it is itself a letter 
'o'. We are now crossing the blurry frontier between low-level and high-level perception. 
Chalmers, French and Hofstadter [3] (also [8, ch.4]) explain: 

High-level perception begins at that level of processing where concepts begin to 
play an important role. (...) The distinguishing mark of high-level perception 

Figure 9: Different ways of looking at a perceived letter 'b'. [15] 



is that it is semantic: it involves drawing meaning out of situations. The more 
semantic the processing involved, the greater the role played by concepts in this 
processing, and thus the greater the scope for top-down influences. 

In our architecture, these top-down influences come from the Slipnet, which therefore 
must at least contain some possible 'roles' [descriptions] that can be attached to the 
perceived structures as something to hold on to. Otherwise, in this case of perceiving 
letters, it would be like you've never been taught that letters consist of loops and straight 
and curved lines, and a loop would always be just a loop to you, never 'the belly of a letter 
b'. When the observed structures [letter-parts] are given basic descriptions (starting in 
the very beginning from the pre-attached descriptions), they can send activation into the 
Slipnet, which will spread to other concepts, and of course, also to possible further 'roles'. 
McGraw [15] tells how the story continues: 

The various light activations sum up to a total activation-level for each role. 
Any role with sufficient activation will attempt to "mate" with the specific part 
whose labels activated it (here is where syntax and semantics meet). Roles 
compete with each other for a given part. Usually the role that best matches a 
part will win a fight, but once again, role-matching is probabilistic. A comple- 
mentary way of describing the process is to say a given part competes for the 
attention of various roles. 

So now, a part of a letter has been given a certain role, for example 'loop' or 'left post' (or, 
why not, 'belly'). As different roles fight, and structures between the parts (the black 
dots in figure 9) are built by codelets in the Workspace, it will gradually become 
(probabilistically) clear which letter is being perceived. This is by no means self-evident, 
as you can see in figure 9, where the third set of parts also looks very much like 'lo'... 
Likewise, try to read a text without spaces. You might be amused by it, but your 
perception-making won't be (it will probably give you a headache). As you probably saw 
coming from a mile away, there actually exists a project, named Letter Spirit, that 
tries to recognize letters and that (largely) uses a fluid concept architecture. However, 
it still doesn't perform low-level perception, since the letters must be entered as lines 
connecting points of a grid. Yet its perception-making goes a long way. Once it has 
recognized a letter, it can use this experience to draw the other letters of the alphabet in the 
same 'style'. More on Letter Spirit in Hofstadter [8, ch.10], McGraw [14] & Rehling [20]. 
Figure 10 shows its processing. On top is the Slipnet (the bigger the square, the higher 
the activation), below are the Workspace (left) and Coderack (right). 
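
The role competition McGraw describes can be sketched as follows. This is an illustration under assumptions rather than Letter Spirit's actual mechanism: the `evidence` table of label-to-role activations, the threshold and the proportional draw are all hypothetical, and the label and role names are made up for the example.

```python
import random
from typing import Dict, List, Optional


def role_activations(
    labels: List[str], evidence: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Sum the small activations that each label of a part sends to each role."""
    totals: Dict[str, float] = {}
    for label in labels:
        for role, amount in evidence.get(label, {}).items():
            totals[role] = totals.get(role, 0.0) + amount
    return totals


def choose_role(
    totals: Dict[str, float], threshold: float, rng: random.Random
) -> Optional[str]:
    """Roles above the threshold compete for the part; the winner is drawn with
    probability proportional to activation, so the best match usually
    (but not always) wins. Returns None if no role is active enough."""
    contenders = {role: act for role, act in totals.items() if act >= threshold}
    if not contenders:
        return None
    roles = list(contenders)
    return rng.choices(roles, weights=[contenders[r] for r in roles], k=1)[0]


# Hypothetical evidence: how strongly each label of a part suggests each role.
evidence = {
    "closed": {"loop": 3.0, "circle": 2.0},
    "curved": {"loop": 2.0, "circle": 2.0},
    "tall": {"left-post": 4.0},
}
totals = role_activations(["closed", "curved"], evidence)
role = choose_role(totals, threshold=1.0, rng=random.Random(1))
```

For a part labeled 'closed' and 'curved', the role 'loop' collects slightly more activation than 'circle' and therefore wins more often, but 'circle' keeps a real chance, which is how the ambiguity between a 'b' and an 'lo' can survive for a while.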

Notice that high-level perception still has a lot in common with its lower-level brother. 
It still thrives on isolating the most important elements (those whose roles have the highest 
activations) and trying to bind them in a largely bottom-up manner, into some kind of 
structure, supposing some plausible relations between them. The difference, however, 
is that in high-level perception, top-down pressures are ubiquitous, highly altering the 
course of events... 

3.1.2 High-level perception and representation building 

Although low-level perception has a profound influence on human behavior and probably 
also on the way we attach meaning to the world, the reason it is not included in this 
architecture is that (besides being extremely difficult) it is assumed to have little 
impact on high-level processing. In this paper, we shall adopt this assumption, but future 
research should not take this for granted... But then, you might ask, isn't it just as 
reasonable to assume we can study intellectual processes independently from any perception at 
all, starting from a fixed representation? What makes high-level perception so interesting? 

Well, the problem is that the pressures that have led to the perception, performed by 
the programmer, are lost when it is cast into a simple fixed representation. In our letter- 
example, the program doesn't stop once it has found a certain interpretation for the 
perceived structures. The system is just starting to feel enthusiastic. The Slipnet nodes 
are activated in a sensible way, interesting structures have been built and the codelets 
[workers] are really into building more structures in the predominant style. These are 
valuable pressures that can guide the program in performing even bigger tasks, like when 
Letter Spirit perceives a curl in a specific part of the source letter and uses this to draw the 
other letters of the alphabet in the same style, with the same curl on the 'corresponding' 
place (if possible). In Copycat, the discovery of a structure, let's say a successor bond 
'abc', will inspire the workers to find even more successor bonds (by top-down pressures). 
Even bridge-building workers will prefer building bridges between successor groups, since 
it makes perfect sense in the current perception (why try building other bridges, if there 
is no pretext?). The commingling of all these pressures launches the whole system into a 
certain direction, completing the analogy (or another task) in a sensible way. 

"Concepts without percepts are empty, percepts without concepts are blind", Immanuel 
Kant once stated eloquently. This became a key point in the philosophy behind this 
architecture: the perception of a problem, and solving high-level tasks with the resulting 
interpretations cannot be separated without loss. The two are heavily intertwined, and 
together, they form a feedback system. Perception-making provides pressures for the 
high-level task, and the high-level task processing adjusts the perceptual process. If you 
make a bad first perception, high-level processing will get stuck, and the places where the 
interpretations don't fit in (its weakest structures) will tell where the perception should be 
revisited in order to improve this (a principle dubbed "the squeaky wheel gets the oil" by 
Hofstadter). Perception is thus always guided by a (top-down) conceptual influence [3]: 

Without this conceptual influence, the representation that results from such 
perception will be rigid, inflexible, and unable to adapt to the problems pro- 
vided in many different contexts. (...) Integrating perceptual processes into 
a cognitive model leads to flexible representations, and flexible representations 
lead to flexible actions. 

Contextual factors, goals, beliefs and their resulting bottom-up as well as top-down 
influences offer a lot of precious information that you should not throw away by using rigid 
(hand-coded) representations, even if sometimes they can be partially regained afterwards 
(with heuristics for example). This would lead to (a) excluding a priori a number of 
possibilities that might be very interesting, because you didn't notice them or considered 
them less important (or worse, just forgot to add them), or (b) doing a full-width search 
through all concepts and possible relationships that even remotely apply to the situation, 
which in real life leads to a combinatorial explosion. Mitchell [17] wittily exemplifies this: 

If you hear a funny clacking noise in your engine and then your car won't 
start, you might give equal weight to the possibilities that (a) the timing belt 
has accidentally come off its bearings or (b) the timing belt is old and has 
broken. If for no special reason you give equal weight to the third possibility 
that your next-door neighbor has furtively cut your timing belt, you are a bit 
paranoid. If for no special reason you also give equal weight to the fourth 
possibility that the atoms making up your timing belt have quantum-tunneled 
into a parallel universe, you are a bit of a crackpot. If you continue and 
give equal weight to every other possibility. . . well, you just can't, not with a 
finite brain. But, on the other hand, there is some chance you might be right 
about the malicious neighbor, and the quantum tunneling possibility shouldn't 
be forever excluded from your cognitive capacities or you risk missing a Nobel 
Prize. (...) Counterintuitive possibilities (...) must be potentially available but 
must require significant pressure to be considered (e.g. you've heard complaints 
about your neighbor; you've just installed a quantum tunneling device in your 
car; every other possibility that you have explored has turned out to be wrong). 

When you read a sentence, you don't first abstract all the words before you start attaching 
meaning to them. The meaning already emerges from the first few words you read, 
and you often don't have to read the whole sentence because the pressures that have emerged already 
suggest what the sentence is going to be about, unless you got it wrong the first time, 
in which case you have to focus more on the words to change their interpretation until it 
makes sense in the context. Many psychological tests have shown that adults only use a 
minimum of information to make sure the current interpretation is trustworthy enough to 
lean on. Especially in vision, people only look at the most important aspects, and often 
don't realize when a small change happens. That's the trick a magician often uses when 
making things disappear (children, not yet fully trusting their internal representation, do 
still notice a lot more, and that's probably why many magicians don't like performing for 



Figure 11: Different outcomes on 1000 runs of Metacat on the problem 'abc => abd; mrrjjj => ?' 
(average number of codelets per run: 623). Under each answer, the temperature 
reveals how confident the system was with this answer (the lower the more confident). [12] 



children). A system could never afford to drop a lot of information without a strong 
interplay between new perceptions and a stored representation of the world. Likewise, fluid 
concept architecture also drops information by going with a general pressure, disregarding 
all other possible routes, unless the route taken doesn't turn out to be satisfactory enough 
and the system changes course. Helped by a little non-determinism in the system, this 
means the system will produce different outcomes while starting with the exact same 
information. Figure 11 shows a summary of the different outcomes on 1000 runs of Metacat. 

This interaction between perception and task-oriented processing also makes the use- 
fulness of a separate "representation module", doing the perceptual processes indepen- 
dently and outputting the desired representations, questionable. Chalmers, French and 
Hofstadter [H p. 176] state: 

...for the accurate modeling of cognition it is necessary that the representation 
of a given situation be able to vary with various contextual and top-down in- 
fluences. This, however, is directly contrary to the "representation module" 
philosophy, wherein representations are produced quite separately from later 
cognitive processes, and then supplied to a "task-processing" module. 

3.1.3 Anthology of interests 

Perception thus sets us on a certain track, and it better be a good one. But how can we tell 
a good train of thought from a bad one? How do we feel we're okay? For one, conceptual 
relevance is very important, as is the way our perception fits the current situation. We 
want to focus on what seem the most relevant aspects of the situation, to find the essence 
hidden within. When a certain structure is built, and the workers assess it fits nicely into 
the surroundings, the structure will get a great strength-value, making its corresponding
concept very active (relevant), creating top-down pressures to look for more structures 
using this concept, and thus creating a drive towards this assumed essence. If the corre- 
sponding concept is indeed (part of) the essence of the situation, this will lead to a stable 
architectural viewpoint, otherwise the concept will quickly decay again because it is found 
to be irrelevant in the long run. For example, discovering a successorship in a string like 
'abc' and then again in its opposite 'cba', will keep the successorship-concept active, and 
thus likely to make up the essence of the analogy. In Metacat, these would make up very 
active themes, further encouraging the pursuit of this assumed essence. This example also 
shows that an essence can shift as perception shifts to another level of abstraction. First
(at letter-level) the successorship is important, but later (at string-level), the system will 
largely focus on the opposition. In principle, multiple essences can exist simultaneously 
(as can multiple themes), as long as they don't interfere, which is why a healthy competition
between structures is needed. When perception creates inconsistent or clashing
structures, we get nervous, temperature rises, we start to question our assumptions and 
raise our guard, breaker codelets break down the weaker structures and we become more 
open-minded, open to other trajectories. A certain amount of contradiction is acceptable, 
but we also want to adjust our course in good time. Also, the more relevant an aspect is, 
the less we want it to create contradictions. Self-watching mechanisms are as important 
as they are hard to define, and much depends on the programmer to make this work. 
The bottom line however is that once the system assumes an aspect to be essential, it 
shouldn't ignore it, and a drive should emerge focusing on it. 

Finding an essence can be a tricky thing, since different people live in conceptually 
different worlds. Take, for example, a flower. There's much to know about it, yet different 
people will mainly concentrate on what's the essence of a flower to them, as formed by the 
particular situation they're usually in. To a biologist, by professional bias, a flower is that 
part of a plant that's used for reproduction. Direct connotations are stigma, petal, bee, 
pollen, nectar and so on. A poet will more easily think of things like color, beauty, love, 
fragility... Yet another poet might have mainly sad memories and think about thorns, 
poison and cruel beauty. But on every occasion the essence of a flower changes. If you 
walk in the garden you might only think about its color and its nice scent (and probably 
not about aromatic chains of hydrocarbons), if you are at the florist you're probably 
concentrating on freshness, gift and happiness. Even our sad poet might one day return 
to see flowers as beautiful things. James [10] expressed this (back in 1890) as follows: 

There is no property ABSOLUTELY essential to one thing. The same prop- 
erty which figures as the essence of a thing on one occasion becomes a very 
inessential feature upon another. (...) The essence of a thing is that one of 
its properties (...) is so important for my interest that in comparison with it I
may neglect the rest. 

A biologist will see a flower's beauty as being vaguely relevant, but hardly essential
when doing a dissection. It will pop into his mind, but will also be quickly discarded. 
Likewise, successorship of letters may be essential to solving copycat-like analogy puzzles, 
but not when reading a novel. There is thus a subtle distinction between relevance and 
essence. When beginning to perceive something (like this sentence), we may suppose
relevance, but its essence is scarcely found before the end, and sometimes we might be 
completely wrong about it. The fact that 'a' is the first letter in the string 'abc' is cer- 
tainly somewhat relevant, but probably not quite essential. It only becomes essential if 
it leads us to a deep understanding (or whatever our goal is). Copycat seems to find 
something essential in a given situation, and is able to transfer it. But how does it know 
where to look? What concepts make it from being relevant to being essential? Suppos- 
edly, they have to remain highly activated (relevant) over a decent period of time. These 
concepts must thus have many (and very strong) perceptual structures, and a large
depth-value also helps, by slowing down the decay process. Also, the temperature has to be
kept low enough to make sure the program keeps following this viewpoint. Codelets thus
play a large role in finding the essence, by awarding strength-values, and by regulating
the temperature. Thus, we might state that in this architecture, we feel safe when we
are focusing on concepts that are very relevant, deep (having a large depth-value), and
preferably not inducing clashing perceptions. This seems like a very acceptable rule of
thumb, but one should remain vigilant while writing codelets and assigning depth-values,
to make sure that this agrees with one's goal. It is a big plus for the architecture that
it allows the system to pursue an essence, even though it has to be set somewhat on the
right track (through codelets and depth-values). However, it is questionable whether this
really is a firm basis for "essence-finding" or "staying on the right track" in general.



In the same fashion, it is unclear whether depth-values should be dynamic or not.
This value should reflect the relevance a concept must gain to become highly activated, 
to bubble up to the surface, and therefore the extra attention it deserves. But in different 
situations, a concept may have different depths. A related project, called Tabletop (by 
Robert French [?]), is about making analogies using objects found on each side of a table. 
A spoon could correspond to a fork if there were two of each kind at opposite sides of the 
table, a cup could correspond to a glass if there wasn't a cup at the other side and so on. 
Playing this game, I asked a friend to make an analogy with my spoon, and he touched 
his plate, since he didn't have a spoon, and both were hollow. Hollowness was a very deep 
concept, very essential to the analogy. Dealing with deep concepts somehow tends to 
strengthen our confidence, and analogies based on deep concepts often seem more intelligent.
But if his side of the table were full of hollow objects, then hollowness would lose its
depth (it would become more apparent), and touching a plate wouldn't make much sense,
since the plate and the spoon have no other characteristics in common. The abundance of
hollow objects somehow makes the concept hollowness less deep (or at least makes this
concept's depth less of an issue). When reading a poem, rhyme may be something deep, but
not in a language that has nothing but rhyme. Are we giving the program some of our
experience by setting these depth-values, something like a concept's expected importance
in the most expected situations? Does it create something like a mindset? We may have to
refine what we mean by this depth-value.

Being able to autonomously pursue an essence also has some indirect advantages, since
you don't have to rip relevant concepts out of their conceptual network before giving 
them to the program. This would obstruct flexible thinking about concepts, because so 
many links are broken. Sometimes this will pose no problem, but when the flexibility of 
concepts becomes important, for example in humor and creativity, you're basically making 
yourself blind. In creative thinking, many far off links might just bring you to concepts 
that are very interesting, possibly radically shifting the essence of a situation, and that's 
why you want to keep them, even if at first sight, they seem superfluous. Talking about 
far off links, while thinking of a title for this section, it occurred to me that 'anthology',
translated into Dutch, becomes 'bloemlezing', which literally translated means 'reading of
flowers'.

3.1.4 Gradually solving the case 

It might be hard to imagine what it takes to build up your own representations, to find 
an essence based on the relevances of different aspects, to attach roles to the different 
things you see. In fact, it is somewhat like solving a murder case (but mostly on a lower 
level). Therefore, imagine you are a detective trying to find a perpetrator amongst a host 
of suspects, starting with attaching initial descriptions (preconceptions) to the people 
involved. Let's say a landlord was found dead in his garden, presumably struck down 
with a shovel. You might suggest the butler is innocent, the gardener is the assassin 
and the maid set up the whole plot, until this theory doesn't make sense anymore when 
these descriptions and the following deductions [workspace structures] start to clash with 
contextual information. For example, the maid's behavior might reveal she was having an 
affair with the landlord, and this creates a pressure leading you to a new representation. 
You may have been right about the butler being a good man, but the discovery that
the maid had an affair opens up the possibility that some people might be jealous (most
likely a deep concept in solving crimes), and these top-down influences make it reasonable
to assume for a while that the butler is a jealous man who has long been in love with
the maid (flexing his representation, changing his essence), to see if that leads to a sound
solution. For example, that the butler killed the landlord out of jealousy using a shovel to
avoid suspicion. Including each and every possibility and scanning them one by one (and
then again when one fact turned out to be irrelevant) would probably take more than the
average detective's lifetime. Every possibility should however be readily available, but
only allowed to enter the arena when there's enough pressure (a hunch, a suspicion, a
piece of evidence previously deemed insignificant, a deduction,...) to support it, and it
must fit nicely into the context to stay supported. This requires that the representation is 
sensitive to bottom-up as well as top-down influences, and that it can easily be flexed at 
the right points, possibly radically reshaping the perception of the situation at the drop 
of a hat. Although fluid concept architecture isn't readily able to solve such big problems, 
it does try to offer a means to create the flexibility one could expect to need for it. 

3.2 The impact of perception: a practical comparison 

To make these abstract ideas a bit more practical, let's compare the way fluid concept ar- 
chitecture incorporates perception with how most traditional systems use the perception 
done and entered by their programmers. Because I need to compare with something, I 
chose production rule systems (with predicates as representation), mainly because they're 
well-known and very different, thus allowing quick recognition and high contrast. This 
is not at all an attack on these systems. I don't have to mention that production rule 
systems and predicate logic are very powerful tools in AI. It's just that for some not-exact 
tasks (mostly the ones easy for humans, but hard for computers), they seem somewhat 
far-fetched and brute, and a more 'natural' way, involving bottom-up and top-down pres- 
sures, might be very interesting. 

As an illustration, I shall apply both ideologies to the "Traveling Salesman Problem" 
(TSP). Neither is particularly good at it (the problem scales exponentially and is conceptually
shallow, thus hard for both computers and humans), but it allows us to highlight the stages where
perception comes in and how it is integrated in both methodologies. The following three 
facets reflect the incidence of perception: 

• the representation of a certain concept and the chosen syntax 

• the way you go from one 'interpretation' or 'state' to the next (possible transitions) 

• the way the system decides which of the possible transitions to do next 

3.2.1 Representation and syntax 

Apart from the structure you use to represent your data (predicates, frames, scripts, 
semantic networks,...), there are two important problems in which perception plays a key 
role P p. 173]: 

relevance - you must filter the relevant facts out of a possibly huge array of data 

organization - you must determine which pieces of data are related, and organize them 
in a coherent structure 

In a fluid concept architecture, the program's initial view on the world consists of 
groups of tokens with some pre-attached (very basic) descriptions in a certain configura- 
tion, backed up by general 'knowledge' about concepts in a (small) domain and how to 
construct a representation out of it. With hand-coded representations, the program sees 
a 'rich' world of fixed relations and objects, tailored to a specific problem and written in 
a fixed syntax. The programmer who has entered these representations has thus already 
filtered out the (seemingly) uninteresting data, has decided which relations are 'impor- 
tant' and has put them in the right form. So, in the second case, the perception is largely 
done, while in the first it hasn't even started yet. 

Figure 12: Initial perception in a TSP.



In Copycat, a situation is represented by descriptions (the attributes of a given ob- 
ject) and by bonds (the relations between objects). Turn once more to figure 5. Beneath 
each instance are its attached descriptions (attributes) and the rows spanning multiple 
instances show reified bonds (relations) existing between the spanned instances, which in 
turn can have their own representations. It is a snapshot of the current interpretation, 
the organization of the data the program deems useful in the current context. These 
descriptions are attached (and detached) by specialized workers in the Workspace. This 
means these have to know which concepts can serve as part of a description. In Copycat 
they make use of the Category/Instance links of the Slipnet. As you probably remember, 
a description consists of a description type and a descriptor, for example "Letter Category:
a", and you can find this link in Copycat's Slipnet in figure 3. This way, the fate of the
instance (or structure) is bound to these concepts, being in the center of attention when 
they are highly activated, and totally neglected when they fade away. That is, as long as 
the description exists, because a worker can deem another description more appropriate 
(in the new viewpoint). When the new description wins the fight, the old one will be 
broken down and replaced by the stronger one. As a result, the representation is highly 
sensitive to bottom-up and top-down influences during a run, and can mold quickly into 
a new interpretation. 
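
The replacement of one description by another can be sketched in a few lines of Python. This is only a minimal illustration and not Copycat's actual code: the class names, the 0 to 100 strength scale, the "String Position" description type and the strength-weighted coin flip that decides the fight are all assumptions made for this sketch.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Description:
    description_type: str   # e.g. "Letter Category"
    descriptor: str         # e.g. "a"
    strength: float = 50.0  # assumed 0..100 scale, set by evaluating workers


@dataclass
class Instance:
    name: str
    # At most one description per description type (an assumption of this sketch).
    descriptions: dict = field(default_factory=dict)

    def propose(self, new, rng):
        """Attach `new`; if a rival of the same type exists, let them fight."""
        old = self.descriptions.get(new.description_type)
        if old is None:
            self.descriptions[new.description_type] = new
            return True
        # Assumed fight rule: the newcomer wins with probability proportional
        # to its share of the combined strength, so stronger usually wins.
        total = old.strength + new.strength
        if total > 0 and rng.random() < new.strength / total:
            self.descriptions[new.description_type] = new  # old one is broken down
            return True
        return False
```

A worker would call `propose` whenever it deems another description more appropriate; because the outcome depends on the current strengths, the same proposal can fail early in a run and succeed later, once the viewpoint has shifted.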



The program decides for itself which concepts in the Slipnet are relevant at each stage
of the processing. Linking up a description with Slipnet concepts means a jolt of activation
is sent to that particular concept, which becomes activated (while irrelevant concepts
do not). Spreading activation to conceptually close nodes can result in new concepts being
deemed important (e.g. letter -> group) and so on. This is the way fluid concept architectures
tackle the relevance problem. The making of bonds and the (temporary) attaching
of descriptions to instances as well as to structures keeps the structure of representations
very flexible, as will be discussed in section 3.3.
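
The jolt and the subsequent spreading of activation might look roughly as follows. This is a hedged sketch: the function name, the 0 to 100 scales, the "number" node and the rule that a neighbor receives a share proportional to (100 - conceptual distance) percent are assumptions chosen for illustration, not a transcription of any existing implementation.

```python
def spread(activation, links, source, jolt=100.0, max_activation=100.0):
    """Give `source` a jolt of activation and pass part of it on to its neighbors.

    activation: dict mapping node name -> current activation (0..100)
    links:      dict mapping node name -> {neighbor: conceptual distance (0..100)}
    Returns a new activation dict; the input is left untouched.
    """
    new = dict(activation)
    new[source] = min(max_activation, new.get(source, 0.0) + jolt)
    for neighbor, distance in links.get(source, {}).items():
        # Assumed rule: conceptually close nodes (small distance) get a large share.
        share = new[source] * (100.0 - distance) / 100.0
        new[neighbor] = min(max_activation, new.get(neighbor, 0.0) + share)
    return new
```

With a short link from 'letter' to 'group' and a long link to some other node, attaching a letter description would make 'group' fairly active while the distant node barely notices.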



Figure 12 shows a possible initial state of a TSP. The task is to find the shortest tour 
through the cities San Francisco, Boston, New York, Miami and Dallas, visiting each city 
exactly once. Our human programmer (on the right of the figure) has cleanly picked out 
the important facts, namely the 5 cities (which she translated into objects) and the needed 
links, thus only the ones involving two needed cities (which were translated in the (only) 
way the program could understand). Also, the isolation of each city can be entered, if the 
algorithm used knows how to handle it. The fluid concept architecture is given the cities 
in the problem, represented here as things (lists) with some pre-attached descriptions. Its 
Slipnet however contains more cities than stated in the problem, since it is meant to cover
the entire domain (for simplicity, I've only thrown in two more cities and their links to 
other cities in the Slipnet, but in fact the entire domain could be entered). It also has 
nodes for the concepts city and route, and for the position of a city in a route: end (the 
head or a tail in a route) and single (when the city is alone, not part of a route yet). To 
keep the figure surveyable, I omitted the Category links, but they are quite straightfor- 
ward. Attaching descriptions to city instances will activate only those cities which have 
an instance in the Workspace. This way we (initially) only consider relevant cities. The 
numbers on the links represent the distances between the cities. I chose them to be the 
initial conceptual distance. I didn't include any altering of this distance because the orig- 
inal TSP doesn't include it. It could however be extended. For example, one could add 
a label airplane on all the links that represent an airport connection and expresses the 
urge to go by airplane. If the concept airplane is highly activated, the distances between 
cities with an airplane-link will shrink, bringing them more closely together, and thus 
more interesting to visit. Also, the traveler's company could for example want to avoid 
too many airplane tickets, and ask to lower the airplane activity with every trip done by 
airplane, making airplane connections less and less favorable. It would be as if airplane
flights became less and less relevant (because the company loses interest). The way the
activation of a concept drops is one of the many parameters in the system, and although
it is normally set to decline exponentially, it shouldn't be a problem to let it sometimes
be regulated by codelets. It could make an interesting extension.
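
The airplane extension suggested above can be made concrete with a small sketch. Everything in it is hypothetical: the function names, the linear shrinking rule, the `max_shrink` fraction and the fixed penalty per flight are assumptions used only to show how an active label could pull cities together and how a codelet could lower that activation after every flight.

```python
def effective_distance(distance, has_airplane_link, airplane_activation,
                       max_shrink=0.5):
    """Conceptual distance of a link, shrunk when 'airplane' is active.

    airplane_activation is on an assumed 0..100 scale; at full activation an
    airplane link loses the fraction `max_shrink` of its length.
    """
    if not has_airplane_link:
        return distance
    return distance * (1.0 - max_shrink * airplane_activation / 100.0)


def after_flight(airplane_activation, penalty=20.0):
    """Company policy: each trip by airplane lowers the 'airplane' activation."""
    return max(0.0, airplane_activation - penalty)
```

After each flight the remaining airplane links grow back towards their original length, so flying becomes less and less attractive as the trip proceeds.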

The conceptual depth of a concept (the right box under a city name, the left
box being its activation) affects the amount of focus the concept can keep. A deep
concept loses activation slowly, thus keeping it relevant for a longer period. Here, I first
of all distinguished concepts by observability. 'City' and 'Route' are harder to observe (more
abstract) than 'Dallas' and 'New York'. Nevertheless, I didn't choose all the cities to have 
the same depth, since some cities are more difficult to reach than others (less apparent 
in another way). This creates an extra focus on isolated cities. When we are close to an 
isolated city, we want to visit it soon, because visiting nearer towns first would lead us 
further and further away again, and we would have to do long (expensive) trips in the 
end to make up for it. As the architecture creates a drive towards deep concepts, we want 
some drive towards visiting isolated cities. When we visit a city, it gets a lot of activation, 
spreading it to its nearest neighbors, and creating an urge to visit these neighboring towns 
first, as high activation means a high salience value for its instances, and high salience
means that workers in the Workspace will be inclined to bind those cities together first
(which corresponds to traveling between them). An isolated city holds its activation for 
a longer period, keeping it interesting as long as we are visiting cities nearby. 
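
The effect of depth on decay can be illustrated with a short sketch. The rule used here, that a node loses (100 - depth) percent of its activation at every update, is an assumption made for this example (it gives the exponential decline mentioned earlier), and the depth values 30 and 90 are invented.

```python
def decay(activation, depth):
    """One update step: the node keeps `depth` percent of its activation."""
    return activation * depth / 100.0


def steps_until_below(activation, depth, threshold=50.0, max_steps=1000):
    """Count update steps until the activation drops below `threshold`."""
    steps = 0
    while activation >= threshold and steps < max_steps:
        activation = decay(activation, depth)
        steps += 1
    return steps


# A shallow (easily reached) city versus a deep (isolated) one.
shallow_steps = steps_until_below(100.0, depth=30.0)
deep_steps = steps_until_below(100.0, depth=90.0)
```

Under these assumptions the isolated city stays above the threshold several times longer than the nearby one, which is exactly the extra focus the depth value is meant to provide.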

3.2.2 Transition mechanisms 

A system is often expected to transform an original situation (state) until it meets certain
conditions. Traditionally, this is done by predefining a number of possible actions or
rules and trying them one by one. For fluid concept architectures, this is also partially
true, but what normally would be considered an action is now a chain of agents that
first assesses the fruitfulness of the action, and instead of trying all possibilities, only a
couple of actions are running in parallel. These actions are 'excited' by pressures (high
Slipnet activations or scouts finding something interesting) in the form of highly urgent
agents. Because actions are distributed over several agents, they can be temporarily or
permanently interrupted at any time. This allows a parallel execution of multiple actions
at the same time, and a direct impact of changing perceptions and pressures. When one agent
changes the representation or creates (or destroys) any kind of structure, this can lead to
a change in the activation pattern in the Slipnet, resulting in concept nodes throwing in
new top-down agents with a high urgency-value. These overtake waiting agents and change
the situation in such a way that a running chain of agents gets broken, either because a
follow-up agent needs a structure or description that no longer exists when it finally enters
the Workspace, or because this structure isn't as interesting as it used to be, since the
program has just changed its interpretation (and thereby its architectural viewpoint).

Figure 13: Transitions in a TSP.

Perception, translated into descriptions, structures, and activation patterns in the Slip- 
net, creates a number of pressures that commingle until a number of (top-down) agents 
is created to carry these pressures further. It's a kind of survival of the fittest, creating 
structures (both perceptional and task-oriented) that become better and better adapted 
to other structures and their surroundings. Without a quick overruling of weak structures 
by stronger ones, the weak ones will have too much influence on the process, guiding it in 
the 'wrong' direction. 
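
How urgent agents can overtake waiting ones without silencing them completely can be sketched with a toy Coderack. This is a minimal illustration under stated assumptions: the class layout, the method names and the choice of picking agents with a probability proportional to their urgency are assumptions made for this sketch, not a description of an existing implementation.

```python
import random


class Coderack:
    """A pool of waiting agents, each posted with a positive urgency value."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.codelets = []  # list of (urgency, name) pairs

    def post(self, urgency, name):
        self.codelets.append((urgency, name))

    def pick(self):
        """Remove and return one agent, or None if the rack is empty.

        The choice is random but weighted by urgency, so a highly urgent
        top-down agent usually runs first while a weak one keeps a chance.
        """
        if not self.codelets:
            return None
        weights = [urgency for urgency, _ in self.codelets]
        index = self.rng.choices(range(len(self.codelets)), weights=weights, k=1)[0]
        return self.codelets.pop(index)
```

Because the choice is weighted rather than strictly ordered, weak pressures are not starved, yet strong ones dominate quickly enough to keep weak structures from steering the process.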

When using hand-coded representations, the perception of the programmer (if done 
right) sets the system directly on the right track. The program has all the pieces on all 
the right places, but is totally unaware of the pressures that drove the programmer to 
this representation and it has to equally consider all possible further actions. It's like a 
game of chess in which a chess player has to play with a configuration she didn't create 
by herself, unaware of the pressures that led to that particular configuration. She would 
be allowed one move, then another (identical) chess player would enter, playing one move, 
and so on. Every time, the player has to equally consider all the possible moves. The 
role a certain piece had (a description) in a strategy that was forming is lost over and 
over again. We'll come back to this example very soon. Perception has a continuously focusing
nature, creating pressures that actually launch you down a certain path, for better
or for worse, until the pressures change and you are led down another alley. Eventually,
the program hopefully gets its perception 'right', guiding it towards a sound solution.
Without this pressure however, you have to try (all) different possible actions, and see 
where they bring you. Often, this is still the fastest way to go, but when the problem 
you are facing gets really large or when it becomes unclear what to enter in the program, 
pressures arising from perception can become very useful. 

Figure 13 shows what happens in our TSP. The classical approach is to use a structure, 
such as a tree, in which all possible outcomes, as the result of all applied rules, are stored. 
A start location has been chosen (New York), and all possible routes to another city are 
tried. In the fluid concept architecture, different codelet chains are running in parallel. I 
chose 3 scout-evaluator-builder triplets, one for descriptions, one for bonds, and one for
routes. Also a breaker-agent is entered virtually, but not yet created by the Coderack. 
A bunch of bond scouts has already been unleashed, and they have proposed a number 
of bonds (in dashed lines). Bond evaluators are called, but still waiting in the Coderack 
(the numbers before the agents are their urgencies). One strong bond has already been 
built. This has been noticed by route scouts, which have proposed a route and called
in highly urgent route evaluators and builders, which have already constructed the first 
route. Another description builder is about to change the Single-description of Boston 
into End, as it has become part of a route. This description is needed so that bond scouts 
can tell whether a bond can be proposed (a city without end or single can't be bonded, 
since it would already have 2 bonds). The scouts and evaluators on the field must also 
check for route loops before proposing further development of a bond, because these are 
not allowed here. New York and Boston have become very active because of their inclusion 
in a route, and have spread this activation to their neighbors. This means these neighbors 
are relevant, and should be looked at soon. This way, nearby cities get linked up sooner 
than distant ones, and shorter links get reified sooner. Miami and San Francisco have
already gotten enough activation and have spawned top-down bond scouts to see if their
city instances can be bonded. Notice that no link between New York and Miami has been
proposed yet. This is just a matter of time. Top-down New York and Miami bond scouts
are already on their way. On top of that, the Miami-instance will get unhappier if it
doesn't get bonds soon, and shout out for more attention. The same goes for Dallas
and San Francisco. It's very important that these different pressures can compete with
each other in a fair way. Remember that this example is completely fictitious (to keep
it simple); in a real implementation, a lot more codelets (especially scouts) are to be
expected in the Coderack.
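
The two checks a bond scout has to make before proposing a bond (both cities must still be described as end or single, and the bond must not close a loop) can be written down directly. The sketch below is hypothetical: the function name, the dictionaries and the label "inner" for a city that already has two bonds are assumptions introduced for illustration.

```python
def can_bond(a, b, position, route_of):
    """Decide whether a bond between cities `a` and `b` may be proposed.

    position: dict city -> "single", "end" or "inner" (already has two bonds)
    route_of: dict city -> identifier of the route it belongs to, or None
    """
    if a == b:
        return False
    # A city that is neither single nor an end of a route already has 2 bonds.
    if position[a] not in ("single", "end"):
        return False
    if position[b] not in ("single", "end"):
        return False
    # Bonding two cities of the same route would create a loop.
    same_route = route_of[a] is not None and route_of[a] == route_of[b]
    return not same_route
```

In the situation of Figure 13, New York and Boston are both ends of the first route, so a further bond between them is refused, while a bond between New York and Miami can still be proposed.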

3.2.3 Transition control 

There's something missing in our chess-example. The chess-player must indeed equally 
consider every move, but she can use her own perception to assess the value of a certain
move. So here another incidence of perception enters. A program can use evaluation func- 
tions or heuristics entered by the programmer. In fluid concept architecture, this becomes 
very blurry. A great deal of the information normally offered by heuristics emerges out of 
the play of pressures. For example, a heuristic estimating the value of an action has (to 
a certain extent) a counterpart in the relevance of the concepts involved in that action,
thus making the action seem preferable or not. But whereas a heuristic is mostly added
in advance, being some kind of (artificial) translation of the way we want the program to 
act, these pressures emerge by themselves, and are heavily dependent on the direction the 
program as a whole is taking. Since they are part of the problem, they can adapt quickly 
and are shaped to the particular problem. Heuristic functions are often an abstraction 
of the pressures we want to arise. One big advantage problem-specific heuristics offer is 
that they can often express the 'distance' to the goal, while pressures can only behave 
as an evaluation function, reflecting the overall 'fitness' of the current structures built in 
the light of the problem. Fluid concept architectures thus have a natural drive towards 
fitness, but not so much towards a specific goal. 

This architecture thus needs its designers to write scout and evaluator codelets, and
these appear to be little more than evaluation functions, but applied in a parallel terraced
scan. Whereas normal evaluation functions are quite complex and have to be calculated
after each action is assumed (temporarily made), many scout codelets first perform a very
cheap calculation, and only the satisfying actions are proposed to be further evaluated by
the evaluator codelets. Since the codelets are written in the light of a domain (although
some codelets may be general enough to be used in many domains), they cannot explicitly
judge an action with respect to the given problem. Instead, a codelet judges the action's
result in terms of a strengthening or weakening of the viewpoint in the Workspace, and it
assumes the viewpoint is right, since it has nothing else to go on. Thus, domain-specific
evaluation is done by codelets, while problem-specific tendencies originate from emerging
pressures. The one cannot function properly without the other.
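
The two-stage idea behind the terraced scan, a cheap first look followed by a costlier evaluation of the survivors only, can be sketched as follows. The function names, the distance threshold used by the scout and the strength formula used by the evaluator are assumptions made for this illustration, and the sketch runs the stages one after the other, whereas the architecture interleaves them.

```python
def scout(pair, distances, cheap_limit):
    """Cheap first look: is this link short enough to be worth proposing?"""
    return distances[pair] <= cheap_limit


def evaluate(pair, distances):
    """Costlier second look: compare the link with every rival link that
    touches one of its two cities. Returns a strength between 0 and 100."""
    a, b = tuple(pair)
    rivals = [d for p, d in distances.items() if a in p or b in p]
    return 100.0 * min(rivals) / distances[pair]


def terraced_scan(distances, cheap_limit, min_strength):
    """Return the bonds that survive both stages, with their strengths."""
    accepted = {}
    for pair in distances:
        if not scout(pair, distances, cheap_limit):
            continue  # dropped after the cheap test; never fully evaluated
        strength = evaluate(pair, distances)
        if strength >= min_strength:
            accepted[pair] = strength
    return accepted
```

Here `distances` maps a `frozenset` of two city names to the length of the link between them. The expensive comparison with rival links is only paid for proposals that the scout already found promising.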

Since the architecture doesn't know in advance which structures are going to be the
essence of the problem, it is of utmost importance that different possibilities are investi- 
gated in parallel, and that the pressures and evaluations can have a very strong impact on 
the further course of events. This view was already advocated in 1977 by David Marr [T31 
p.44]: 

the perception of an event or object must include the simultaneous computation 
of several different descriptions of it, that capture diverse aspects of the use, 
purpose or circumstances of the event or object. 

Whereas many classical approaches are searching through "state-space", trying some
transformations, assessing them and going further on the most promising results, fluid
concept architectures are searching through "interpretation space", assessing different
interpretation-related structures (descriptions, bonds) in parallel, creating a drive that
also guides the high-level structures (groups, bridges, rules). Representation building and
high-level processing are thus deeply intertwined. A bad representation (interpretation)
will be quickly revised when it doesn't make sense in the context of the problem, and a
bad high-level structure will be quickly surpassed by structures that are better adjusted 
to the interpretation, ever changing with the pressures emerging from this interpretation. 
This interplay between perception and high-level (semantic) processes can only function 
if both are running in parallel. 

In figure 14, our TSP has matured into a somewhat advanced stage. The classical 
approach has been using a simple heuristic, the nearest neighbor-heuristic, making the 
city that is closest to the head of the route the new head of the route. There are more 
advanced methods (Markov chains, genetic algorithms, simulated annealing, Tabu search, 
neural nets,...), but this little rule actually does pretty well. It reflects the perception that 
linking up two cities that are close to each other is likely to be a step in the right direction. 
However, the problem with this heuristic is that isolated cities are hard to integrate. Our 
fluid concept architecture has also matured. It now has two routes, which shouldn't come
as a surprise, since it searches bottom-up (especially in the beginning, when little is
known) and in parallel. The bond between Miami and Boston has passed the building 
of many other links, and a new route has been formed consisting of Miami and Boston. 
Route scouts have even noticed they can use the bond between Miami and New York to 
bind them into a new route. A route structure has been proposed, and a route evaluator 
with high urgency was added to the Coderack. When the route gets built, there will 
be one large route, and New York and Miami will lose their end-description. This way, 
bond scouts will no longer propose bonds to them (at least, they should be programmed 
that way). The figure also shows the Slipnet, now full of activation, focusing on Object 
Category, Route, New York, Miami and Dallas, which are in a state of full activation. 
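
To make the classical side of the comparison concrete, the nearest-neighbor heuristic mentioned above can be sketched in a few lines. This is only an illustration: the coordinates are invented (the paper gives no numeric data for its cities), and the names CITIES, nearest_neighbor_route and route_length are hypothetical.

```python
import math

# Hypothetical planar coordinates; the paper gives no numeric data for its cities.
CITIES = {
    "Seattle": (0.0, 10.0),
    "Chicago": (6.0, 8.0),
    "Boston": (10.0, 9.0),
    "New York": (9.5, 8.0),
    "Dallas": (5.0, 2.0),
    "Miami": (9.0, 0.0),
}


def distance(a, b):
    """Straight-line distance between two named cities."""
    (ax, ay), (bx, by) = CITIES[a], CITIES[b]
    return math.hypot(ax - bx, ay - by)


def nearest_neighbor_route(start):
    """Greedy route: the unvisited city closest to the head becomes the new head."""
    route = [start]
    unvisited = [city for city in CITIES if city != start]
    while unvisited:
        head = route[-1]
        nearest = min(unvisited, key=lambda city: distance(head, city))
        route.append(nearest)
        unvisited.remove(nearest)
    return route


def route_length(route):
    """Length of the closed tour, including the leg back to the start."""
    return sum(distance(a, b) for a, b in zip(route, route[1:] + route[:1]))
```

With these made-up coordinates, Dallas and Miami are only picked up at the very end of the route, which is the kind of late integration of isolated cities the text warns about.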

The cities Seattle and Chicago have also gotten a little bit of activation, but not too 
much, as they are not deemed relevant (the links connecting them to activated cities are 
long). It could however happen that a city, let's say somewhere between New York and 
Miami (e.g. Washington) gets a high activation through activation spreading and starts 
throwing top-down codelets saying "Hey, I'm relevant, maybe you should try a route 
through me!" The codelets will however be unable to find the Washington-instance and 
nothing further will happen. It may sound weird that this is allowed, but when you would 
be planning a route with a map of the USA in your head, you would probably also quickly 
come up with Washington, also immediately dropping the idea since you have no business 
there. That's just the nature of concepts. Also notice the struggle of the Miami-Boston 
bond, which was proposed long ago but was never deemed interesting enough by 
evaluator agents in the field. Instead, the Miami-New York bond was justly found to 
be preferable, and this stronger pressure has won (for now). When the Miami-New 
York route is built, the Miami-Boston bond will probably disappear quickly since 
it can no longer be supported (Miami can't have more than two bonds). Should Miami 
however lose a bond due to a change of interpretation, bottom-up bond scouts may give 
it a second chance. This interplay of pressures is continuously guiding the system. 

Figure 14: Controlling pathways in a TSP. 

Since the TSP is NP-hard, only an approximate solution is expected from both strategies. 
However, due to its nondeterministic and emergent nature, fluid concept architecture 
can never guarantee finding the best solution (if it even exists). The best it can 
do is to give a good solution often. Implementing the TSP in a fluid concept architecture 
would be very interesting to see how it actually performs, but this is way beyond the scope 
of this paper. There exists however an implementation in BAsCET (Parmentier [18]) (a 
system very much like Copycat, but with some important differences), and some tests on 
the TSP (although with just one codelet) are performed, showing it does pretty well [TI5] 
and that its complexity is roughly linear. 

Still, remember that this problem is mainly chosen to discuss where and how perception 
enters the scene. Fluid concept architectures don't consider this problem their cup of tea. 
They want to discover essences and focus deeply on them, while in this situation, the 
focus is continuously jumping around from one city to another, and perception-making is 
really very shallow (probably making the system 'feel' a bit frustrated, as probably also 
happens with some humans asked to solve it). Nevertheless, even here, perception making 
and searching for relevance can be useful. 

3.3 On the flexibility of representations 

An important property of most AI programs based on predicate logic is that they rely 
heavily on a well-defined syntax and an unambiguous representation (although uncertainty 
reasoning is changing this, see section [4]). Fluid concept architecture is (or at least should 
be) more flexible in this respect. Since it is based on the nature of concepts, involving 
conceptual similarity and slippage, it can handle ambiguities (to a certain extent) because the 
differential relevance between concepts is well defined (in the Slipnet). Also, descriptions 
carry semantic information (each holds a Category/Instance relation), so the representation 
doesn't rely on syntax alone to know how to interpret a certain concept. Let's try to 
make this clearer. In predicate form, we could describe the representation in figure 5 
as follows: 

letter(j_1), group(j_1, j_2, j_3), sameness_group(group(j_1, j_2, j_3)), length(group(r_1, r_2), 2), etc. 

The situation is now cleanly broken up into objects, attributes, functions, first-order 
relations, second-order relations, and so on. Whether something is an attribute or an 
object is only known because it is entered in the 'right' form, and the program 
depends completely on its syntax. When the program sees the label 'group' it knows this 
is a relation, nothing else, with two terms which are both objects. Although this 
strictness is okay (very useful even) for many applications, these borders are very blurry in 
the real world, and if you need to make the distinction, a concept must be able to 
quickly slip into another classification (or even to coexist in multiple ones). For 
example, if we want to express the largest structure in figure 5, we would have to write 
group(group(m_1), group(r_1, r_2), group(j_1, j_2, j_3)). This is simply impossible in predicate 
logic, because there is already a group-relation with three terms, and those terms had to 
be objects, not other relations. We would have to invent new labels and new relations to 
make this possible, and tell the system how to deal with this difference, which makes the 
system as a whole increasingly complex. Already, we had to declare three different 
group-relations, one with one element, one with two elements, and one with three elements, and 
we had to adjust the system to this. If we want a group of four, once again we would have 
to extend the system. In fluid concept systems, this situation is relieved because a group 
is a structure with any number of elements, and descriptions are attached just as simply 
to these structures. 
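
As a rough illustration of that last point, a group can be modelled as one structure type that holds any number of members (letters or other groups) and carries its own descriptions. This is a minimal sketch, not Copycat's actual data model: the class names and the description labels ("group-category", "length") are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Letter:
    name: str
    # Each description is a (description type, descriptor) pair.
    descriptions: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class Group:
    # Members may be letters or other groups, in any number.
    members: List[Union[Letter, "Group"]]
    descriptions: List[Tuple[str, str]] = field(default_factory=list)

    def letters(self) -> str:
        """Flatten the (possibly nested) group back into its letter string."""
        return "".join(
            m.name if isinstance(m, Letter) else m.letters() for m in self.members
        )


def sameness_group(text: str) -> Group:
    """Build a group of identical letters and describe it (hypothetical labels)."""
    group = Group([Letter(ch) for ch in text])
    group.descriptions.append(("group-category", "sameness"))
    group.descriptions.append(("length", str(len(text))))
    return group


# The largest structure of the example: a group of three groups.
whole = Group([sameness_group("m"), sameness_group("rr"), sameness_group("jjj")])
whole.descriptions.append(("length", "3"))
```

No new relation has to be declared for a group of one, two, three or four elements, and a group of groups needs no special treatment either.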

This gives fluid concept architectures a remarkable flexibility. For example, as shown 
in figure 11, Copycat can make the following analogy: abc ⇒ abd, mrrjjj ⇒ mrrjjjj. This 
implies it has recognized a successorship of letters as easily (but not always as quickly) 
as a successorship of lengths, and was able to map the description 'bond facet: letter 
category' onto 'bond facet: length' (because both have 'bond facet' as description type, 
which reflects how well Copycat's Slipnet is constructed). Then it simply replaced the 
rightmost element by its 'successor', which in the first case was an alphabetic successor, 
and in the second case a numeric successor, but Copycat easily abstracted away from this 
difference. Note that Copycat found the answer mrrkkk more often (because it's simpler to 
find), but the temperatures reveal it was much more confident with the answer mrrjjjj. 
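
The two competing answers differ only in which bond facet the rule "replace the rightmost element by its successor" is read under. The sketch below makes that explicit; it hard-codes the grouping that Copycat has to discover by itself, and the function names are hypothetical.

```python
def letter_successor(letter):
    """Alphabetic successor; 'z' has no successor in this sketch."""
    if letter == "z":
        raise ValueError("'z' has no successor")
    return chr(ord(letter) + 1)


def apply_rule(groups, facet):
    """Replace the rightmost group by its successor under the given bond facet.

    groups is a list of (letter, length) pairs, so mrrjjj is
    [("m", 1), ("r", 2), ("j", 3)].
    """
    *rest, (letter, length) = groups
    if facet == "letter-category":
        last = (letter_successor(letter), length)  # alphabetic successor
    elif facet == "length":
        last = (letter, length + 1)                # numeric successor
    else:
        raise ValueError(f"unknown facet: {facet}")
    return "".join(l * n for l, n in rest + [last])


target = [("m", 1), ("r", 2), ("j", 3)]
literal_answer = apply_rule(target, "letter-category")  # "mrrkkk"
deeper_answer = apply_rule(target, "length")            # "mrrjjjj"
```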

Still, Copycat doesn't completely master switching between objects and relations, 
because it internally still draws a clean line between instances and structures. Therefore, 
it cannot recognize instances as being elements of a group, because a description can be 
applied to an instance (letter) or a structure (group), but never to an instance as an 
element of a structure. In figure 5, the descriptor 'rightmost' attached to the rightmost 
letter j, applies solely to its rightmost position in the entire string, not to its rightmost 
position in the group jjj. On the other hand, it recognized effortlessly that the group 
jjj is the rightmost part of the larger group m-rr-jjj. Still, this is not a deep aspect of 
the architecture, and removing this difference might even simplify its functionality. That 
this caveat limits the architecture's capabilities became apparent several times when 
Mitchell [T3, p. 172] compared Copycat's answers to those of human subjects: 

For example, several people gave the answer abc ⇒ abd, lmnfghopq ⇒ lmofgiopr, 
using the rule "Replace the rightmost letter of each successor group by 
its successor." Copycat is currently unable to make descriptions such as 
"rightmost letter of successor group." 

3.4 Parameters and formulas 

Like most emergent architectures, this one is also crawling with parameters and formulas 
that affect its behavior in many ways. Therefore, a large part of building such a system 
consists of finding appropriate values for them. For now, these are set by hand for each 
domain (of course not for each problem), based on intuition, trial and error and a pinch of 
arbitrariness. For those interested, Mitchell [16] describes in detail the values she picked 
while building Copycat. To make clear exactly which domain-specific knowledge is given 
to the program through these parameters, and how bad values could lead to catastrophes, 
what follows is a short description of the most influential parameters and their (presumed) 
effects on each part of the architecture. These remarks are based on intuition, since 
full-range testing is far beyond the scope of this paper. 

3.4.1 Slipnet parameters 

• Concept nodes and linkages determine the demarcation of the domain and its depth. 
If in Copycat, for example, numbers and the length-concept are not included, the 
program just won't come up with them, failing to notice the length of a group. 
Putting in too many nodes and links might create pressures (by unwanted 
connotations) we want to avoid, even if humans have that same problem. As we've seen 
in our TSP, top-down influences have no effect if there's no instance of the concept 
in the workspace, but still, they can disturb the activation pattern. Also, the different 
types of links (as depicted in figure 3) hold necessary information for codelets. Cat- 
egory links for example, are used to define descriptions, and must thus be entered 
wisely. 

• Conceptual depth values influence the system's focus, by keeping deep concepts 
highly activated for a longer time. Setting bad conceptual depths could make the 
system focus on concepts that seem absurd. 

• The highest and lowest values of conceptual depth and amount of activation mark 
the fine-grainedness, or, if negative values are allowed, they could create 'negative 
pressures', discouraging the architecture from considering particular concepts. This is not 
used in Copycat, but Metacat uses it for themes. 

• Labeled link lengths are dependent on the activation of their labels. Fixed link 
lengths are determined by intuition or (mainly for category links) they can be based 
on the difference between conceptual depths. These are very important because 
they represent the conceptual distance between concepts. If these are set wrong, the 
system will act rather strangely, going in directions that seem highly useless. Imagine 
what would happen in our TSP if the distances between cities were set wrong. It 
would seem the routes were made by someone lacking good geographical knowledge. 
Distant concepts must also have large conceptual distances, or your system might 
indeed behave like a bit of a crackpot, making very strange associations. 

• The number of codelet runs before the Slipnet updates could be increased to speed 
up the system, but too large intervals would seriously damage the system's ability to 
focus on the current situation in the Workspace. It would be somewhat like having 
a consciousness disorder and seeing the world make jumps all the time. This would 
make it very difficult to track what is going on. 

• Some nodes will initially be 'frozen' to counter the effect of the initially added 
descriptions. When you see a new problem, you scan it quickly to see what's the 
best point to start from. Likewise, when the initial pressures have commingled, this 
clamping should be lifted soon enough, otherwise it would distract the system. 

• The activation threshold for nodes to get full activation is usually a probabilistic 
exponential formula. In Copycat, a node must have an activation of at least 50 (out of 
100), and then the probability of switching to full activation rises exponentially. This 
has an impact on how quickly the system considers a concept to be important enough 
to let it create top-down pressures (codelets). If this value is set too low, the system 
will be very distracted because a lot of concepts will get full activation, focusing on 
many different concepts at the same time. If it is too high, your system won't focus 
at all, and top-down pressures will hardly arise. 

• The activation decay rate is also very decisive for the system's behavior. Too rapid 
decay means that concepts are not focused on long enough, giving no real direction 
to the system's processing. If the decay rate is too low, almost all concepts 
in the domain will get high activation after a while, and the system will go in all 
directions at the same time. 

• The activation spreading can also be regulated by a formula. This has a huge impact 
on activation patterns in the Slipnet. As we've just seen, too much or too little 
activation in the Slipnet can seriously bring down the effectiveness of the system. 
The first versions of Copycat had to entirely disregard the shrinking of labeled links 
for spreading activation, because it made the Slipnet too active. 
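
To see how the last few parameters interact, here is a minimal sketch of a single node update combining decay, incoming activation and the probabilistic jump to full activation. The concrete formulas are assumptions chosen for illustration (a linear depth-dependent decay and a cubic curve standing in for the 'exponential' rise); only the 50-out-of-100 threshold is taken from the text.

```python
import random

FULL = 100.0       # full activation
THRESHOLD = 50.0   # below this, a node can never jump to full activation


def decay(activation, depth):
    """Assumed rule: a node keeps depth percent of its activation, so deep
    concepts (depth near 100) stay active for longer."""
    return activation * depth / 100.0


def jump_probability(activation, exponent=3):
    """Assumed curve: zero below the threshold, then rising steeply to 1."""
    if activation < THRESHOLD:
        return 0.0
    return (activation / FULL) ** exponent


def update(activation, depth, incoming, rng):
    """One Slipnet update for a single node."""
    activation = min(FULL, decay(activation, depth) + incoming)
    if rng.random() < jump_probability(activation):
        activation = FULL  # the node now starts posting top-down codelets
    return activation
```

Lowering THRESHOLD or the exponent makes many nodes jump to full activation (a distracted system); raising them makes top-down pressure rare, as described in the list above.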

3.4.2 Temperature parameters 

The temperature is usually calculated as a function of unhappiness in Workspace and 
strength of target structures. This is by far the most difficult formula to set. Since 
the program depends on temperature as a judge of the structures that are built in the 
Workspace, this formula must be a very good representation of the quality of these struc- 
tures. Otherwise, the system might unleash hordes of breaker codelets just when you're 
getting close to a solution. Evaluating the fitness of structures in the current (not ex- 
plicitly described) viewpoint is a difficult task indeed. As the temperature is a guide for 
the stability of your viewpoint, it can have a huge impact. It determines how thoroughly 
existing structures should be tested (if you're close to the solution, assessment of fur- 
ther steps is tightened), how important the urgency of a codelet is deemed (if too little 
is known, you can't trust 'calculated' guesses), and when to be pleased with a certain 
solution (you don't want dull solutions, but you also don't want to go on forever). 
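
A sketch may help to show what such a formula could look like and how it feeds back into codelet selection. Everything numeric here is an assumption for illustration: the 0.8/0.2 weighting, the importance weights and the exponent used to sharpen or flatten urgencies are not taken from the paper.

```python
def temperature(unhappiness, importance, target_strength, weight=0.8):
    """Assumed formula: mostly the importance-weighted unhappiness of the
    Workspace objects, partly the weakness of the target structure (0-100)."""
    total = sum(importance)
    weighted = (
        sum(u * i for u, i in zip(unhappiness, importance)) / total if total else 0.0
    )
    return weight * weighted + (1.0 - weight) * (100.0 - target_strength)


def adjusted_urgency(urgency, temp):
    """Assumed adjustment: at high temperature urgencies are flattened (choices
    become more random), at low temperature they are sharpened."""
    exponent = (110.0 - temp) / 15.0
    return urgency ** exponent
```

With this adjustment, a codelet that is twice as urgent as another is only slightly favoured when the temperature is 100, but overwhelmingly favoured when it is 0, which matches the idea that 'calculated' guesses deserve little trust while little is known.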

3.4.3 Coderack parameters 

The Coderack also has a few responsibilities. Since it has a certain size, it can delete 
excess workers. In Copycat, this happens by a probabilistic function of urgency and age 
that is executed each time the Coderack overflows. As long as no important workers 
(young ones with high urgencies) are deleted, it should be okay. We specifically want 
to remove workers that are too old, because most likely, the Workspace and pressures 
have changed significantly, making them obsolete or even disturbing. Aging does increase 
a worker's urgency, and we want workers that have fallen behind to be sent into the 
Workspace quickly, or not at all. Workers must thus get good urgency boosts when they 
age, but if their urgencies increase too much, they might get in the way of 
younger workers with fresh new ideas. 
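
The deletion step can be sketched as follows. The removal weight used here (age divided by urgency) is just one plausible choice and not Copycat's actual formula; the dictionary keys and function names are hypothetical as well.

```python
import random


def removal_weight(codelet, now):
    """Assumed form: old codelets with low urgency are the likeliest victims."""
    age = now - codelet["born"] + 1
    return age / codelet["urgency"]


def evict_one(coderack, now, rng):
    """Remove one codelet, chosen probabilistically by removal weight."""
    weights = [removal_weight(c, now) for c in coderack]
    victim = rng.choices(range(len(coderack)), weights=weights, k=1)[0]
    return coderack.pop(victim)


def post(coderack, codelet, capacity, now, rng):
    """Add a codelet; when the Coderack overflows, delete excess workers."""
    coderack.append(codelet)
    while len(coderack) > capacity:
        evict_one(coderack, now, rng)
```

Because the choice is probabilistic, a young and urgent worker can still be deleted now and then, but it is very unlikely.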

3.4.4 Codelets 

There are few rules for writing codelets. Codelets just have to build structures out of 
instances, assign them a strength value, send activation to the Slipnet and alter the 
temperature. You're completely free in their exact implementation. You must however build 
them carefully, since codelets have a huge impact on the system's efficiency and 
performance. To give the idea, here's how a typical scout-evaluator-builder sequence is written: 

Bottom-up bond scout: 

1. Choose an object in the Workspace: probabilistic function of salience 

2. Choose another (adjacent) object: probabilistic function of salience 

3. Choose a common basis for binding the two objects. For example, in Copycat, 
codelets choose a bonding facet (letter-category or length) as a probabilistic function 
of its activation and the number of descriptions with it 

4. See if both objects can be bonded by this common basis, i.e., if they both have a 
description reflecting this common base. In Copycat: see if both have a descriptor of 
the chosen bonding facet (e.g. have a description "letter-category: <descriptor>") 

5. If so, see if there exists a link connecting both descriptions. In Copycat: see if these 
two descriptors are linked in the Slipnet (e.g. 'a' and 'b' by a successor-link) 

6. If so, propose a bond between the two objects and call in a bond evaluator codelet 
with an urgency proportional to the conceptual distance of the two objects 

Top-down bond scouts work in the same way, except that they are given a predefined 
bond-related concept from the Slipnet (e.g. a certain bond-category, like successor), and 
the first object is chosen as a probabilistic function of its unhappiness and the number of 
occurrences of that bond (successor) in its surroundings (its string). 

Bond evaluators are given the proposed bond and calculate its strength to decide whether to 
give up on it or to call in a bond builder. The bond builder will first check whether the 
proposed bond has already been built in the meantime. If not, it will fight any incompatible 
structures. And if it wins, it will destroy all incompatible structures, build the new bond 
and send activation to the involved Slipnet nodes. 
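
The scout-evaluator-builder chain described above can be sketched as three small functions that hand work to each other through the Coderack. This is a toy, not Copycat's code: the miniature Slipnet, the fixed strength of 80, the even-odds fight and all names are assumptions, and sending activation back to the Slipnet is left out.

```python
import random

# Hypothetical toy Slipnet: (left descriptor, right descriptor) -> (bond category, link length).
SLIPNET_LINKS = {("a", "b"): ("successor", 60.0), ("b", "c"): ("successor", 60.0)}
# Hypothetical activation of each bonding facet.
FACET_ACTIVATION = {"letter-category": 100.0, "length": 20.0}


def pick(rng, items, weights):
    """Probabilistic choice; weights stand for salience, activation or urgency."""
    return rng.choices(items, weights=weights, k=1)[0]


def bond_scout(ws, coderack, rng):
    objects = ws["objects"]
    first = pick(rng, objects, [o["salience"] for o in objects])           # step 1
    near = [o for o in objects if abs(o["pos"] - first["pos"]) == 1]
    if not near:
        return                                                             # codelet fizzles
    second = pick(rng, near, [o["salience"] for o in near])                # step 2
    left, right = sorted((first, second), key=lambda o: o["pos"])
    facets = list(FACET_ACTIVATION)
    facet = pick(rng, facets, [FACET_ACTIVATION[f] for f in facets])       # step 3
    if facet not in left["desc"] or facet not in right["desc"]:            # step 4
        return
    link = SLIPNET_LINKS.get((left["desc"][facet], right["desc"][facet]))  # step 5
    if link is None:
        return
    category, length = link
    proposal = (left["pos"], right["pos"], category)
    coderack.append((length, bond_evaluator, proposal))                    # step 6


def bond_evaluator(ws, coderack, rng, proposal):
    strength = 80.0  # placeholder; a real system computes this from the Workspace
    if rng.random() < strength / 100.0:
        coderack.append((strength, bond_builder, proposal))


def bond_builder(ws, coderack, rng, proposal):
    if proposal in ws["bonds"]:
        return  # already built in the meantime
    left, right, _ = proposal
    rivals = {b for b in ws["bonds"] if b[:2] == (left, right)}
    if rivals and rng.random() < 0.5:  # placeholder fight with even odds
        return
    ws["bonds"] -= rivals
    ws["bonds"].add(proposal)


def run(ws, rng, steps=200):
    coderack = []
    for _ in range(steps):
        coderack.append((30.0, bond_scout, None))  # steady bottom-up supply
        i = pick(rng, range(len(coderack)), [c[0] for c in coderack])
        _, codelet, proposal = coderack.pop(i)
        if proposal is None:
            codelet(ws, coderack, rng)
        else:
            codelet(ws, coderack, rng, proposal)
    return ws


def make_workspace(text):
    objects = [
        {"pos": i, "salience": 50.0, "desc": {"letter-category": ch}}
        for i, ch in enumerate(text)
    ]
    return {"objects": objects, "bonds": set()}
```

Calling run(make_workspace("abc"), random.Random(0)) leaves successor bonds between neighbouring letters in the "bonds" set. Because each stage is a separate codelet, a proposal can still be overtaken by more urgent work before it is evaluated or built.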

Writing codelets must thus be done very carefully, and requires a whole lot of testing. 
If the strength of structures is calculated badly, the temperature is adapted poorly or a 
wrong amount of activation is sent to the Slipnet, the whole system might go haywire. 
Note that the architecture doesn't strictly enforce splitting up codelets. Someone could 
simply replace the scout-evaluator-builder-chain by a single builder codelet, but this would 
go against the spirit of the parallel terraced scan, and would probably reduce the impact 
of perception. The system would still be guided by Slipnet-spawned codelets, but it could 
suffer some kind of time-lag effect, doing actions that were useful in a previous interpreta- 
tion, but that are irrelevant or contradictory in the present one. This sometimes generates 
strange behavior, as weak initial structures can have a great impact on the direction of 
the program. Mitchell [16, ch. 8] discusses the effect of some of these 'lesions' of Copycat. 

There are some higher-level effects you probably won't think about while writing 
codelets, but which may have a lot of impact on the system's behavior. If you are used to 
reading from left to right (as in western cultures), you might unwittingly write codelets 
in such a way that they tend to look from left to right, thus seeing successorship relations 
earlier than predecessorship relations. Likewise, your program will possibly be inclined 
to notice incrementation earlier than decrementation. Also, a philosopher might create 
a different behavior than a computer scientist. It is very likely that no two systems will 
ever fully behave the same, although, if the two programmers are from the same culture 
and profession, you might not even notice. 

3.5 Noise resistance and optimization 

Although bad parameters and formulas can seriously hamper the system's effectiveness, 
there's still a lot of elbow room left. Some values are more critical than others, but gener- 
ally, they don't have to be very, very exact. If you use healthy intuition and (a lot of) trial 
and error, you just might get your system running smoothly. In fact, you will never find 
the 'perfect' settings for all parameters, because the architecture is very nondeterminis- 
tic, using many probabilistic formulas. No matter how hard you tweak, the codelets are 
selected probabilistically, and thus different outcomes are possible. We could also remove 
all probabilistic effects, but this would be against the spirit of only forcing your system 
to focus when it's useful. In the beginning, when little is known, there is little use in 
doing tedious calculations to find out which possibility to try first. It's just one of the 
greatest assets of this architecture that it can easily handle uncertainty, so why not use 
it? Nevertheless, some tweaking might be very helpful to achieve better results. One way 
of doing this is of course trial and error, another is using genetic algorithms, and Detre [1] 
very recently devised a way of doing this. Still, these parameters and their effects have 
hardly been investigated, and a lot of work still needs to be done to optimize these architectures. 

There are also some problems where it is not clear if bad parameter values alone are 
to blame. The system often exhibits some kind of "concentration disorder", as stronger 
top-down forces from the Slipnet don't seem to be given a fair chance, and stronger 
pressures don't seem to be able to overthrow a weak viewpoint. This can be due to bad 
activation patterns (for example, having too much activation), a bad urgency-assignment, 
or a Coderack behavior that is too stochastic. Still, one can wonder if this flaw isn't also 
stimulated by using an architecture that is too loose in handling and controlling top-down 
influences. Copycat has some "bad grouping" problems, for example finding two groups 'ab' 
and 'cd' and not seeing 'abcd', or seeing ii-j and j-kk instead of ii-jj-kk. Mitchell gives 
some suggestions on how to solve this, such as "location specific pressures" (for now, 
a pressure triggers a certain structure to be built anywhere, not just on the spot that 
triggered the pressure), and "subconscious planning mechanisms" (larger structures are 
preceded by necessary prior structures). Focus has been significantly improved by 
using the additional self-watching mechanisms of Metacat, but still, bad grouping occurs 
regularly. Of course, this could be interpreted as the program being creative, seeing things 
differently, but often bad groupings make no sense at all. 

3.6 Applicability 

As we have come to see, fluid concept architecture has some very nice assets. Although its 
solutions are less than optimal and setting it up requires a whole lot of work, it stands 
for a whole new way of treating concepts, relations, representations and high-level 
processing. Its concepts and their relations closely resemble concepts as we use them 
in everyday life, fluidly slipping and changing categories. Its representations are built up 
by its own perception making and are very flexible and sensitive to further influences and 
shifting focus. Top-down and bottom-up pressures and their interaction are modeled quite 
naturally, and the continuous play between high-level processing and perception actually 
gives the system a 'drive' towards a solution, instead of dooming it to systematic myopia. 

This makes it very useful for problems where all this (or part of it) is required. Cre- 
ativity, for example, is founded on making new perceptions and using far-off conceptual 
associations. This doesn't mean that any program using this architecture will be extremely 
creative, but it opens the door, and sometimes, you see one of these programs do things 
you didn't expect. Copycat has often found solutions that were very appealing, although 
no test person came up with them. The system is very good at problems where different 
bottom-up and top-down pressures commingle in possibly very difficult ways, something 
few programs can claim. It is also good at solving unknown types of problems (if it is 
well constructed). Copycat was basically designed to solve five target types of analogy 
problems, but it managed to solve many other types of problems as well, although a host 
of advanced types is still out of its reach (more self-watching was needed, and that's why 
Metacat was built). It's also very good at handling uncertainty. If you say 'rose' instead 
of 'flower', it can still handle the problem if both concepts are in the Slipnet. It will even 
behave accordingly, and possibly give you a result that somehow reflects the difference 
between a rose and a flower. All this makes Copycat/Metacat psychologically very 
realistic, and its answers really come close to the answers given by human test subjects. 
Mitchell [17] assumes that people who prefer 'inferior' analogies would probably, if alive 
during the Pleistocene, have been eaten by tigers, which explains why there are not many 
such people around today. 

You don't want to use this architecture for problems that require someone to be very 
exact, like proving a theorem. It also won't come up with perfect sequences of actions 
towards a certain goal. In fact, when humans are bad at something, fluid concept architectures 
are most likely also bad at it. Nonetheless, humans often find 'tricks' to circumvent the 
main difficulties in the problem, and the architecture might benefit in the same way. 
For example: 6 * 19 = 6 * 20 - 6 = 114. Fluid concept systems will probably tend to do 
something like that. There is an interesting project, Numbo, based on Jumbo [HI ch.3], 
and designed by Defays [SJ ch.3], that uses this approach when trying to reach 
a certain number with a set of other numbers and operators. 

Fluid concept architecture is especially useful if the problem is about finding the 
essences of a situation, and discovering certain structures. This suggests it might be used 
for learning, and in fact the architecture does learn something when it finds the essence of 
a situation, but much more work must be done before it can actually be used for it. On 
the other hand, the architecture has already proved itself in applications where perception 
is very important, such as pattern, letter and image recognition, although thus far only 
in very small domains. It could probably also handle large domains while being very 
resistant to combinatorial explosion, which is a very good asset when extensive domains 
are being used, but few tests have been done on that account. Also it can use contextual 
data, and beliefs and goals can have a real influence on its processing. This could be 
envisaged as a drawback, when you want the system to be free of all that, but for many 
purposes this also could be desired. 

Yet, many AI textbooks will warn you that the fact that a program can find a solution 
in principle does not mean that the program contains any of the mechanisms needed to 
find it in practice. Still, a lot can be expected from this architecture, although it will 
probably see many transformations before it actually gets there. 

4 Relationships with other methodologies 



In this last section, let's widen our view and put some things in perspective. Of course, 
fluid concept architecture wasn't conceived in a vacuum, and the way it relates to other 
approaches in AI and cognitive science can teach us quite a lot about the underlying 
mainsprings. The lessons the designers of this architecture have drawn from other research 
in different corners of the AI world are an important source of insights, and may one day 
help us to draw further lessons from the architecture itself. Therefore, let's first of all 
sketch the breeding ground in which this architecture developed, and the position it 
eventually took. Afterwards, a short comparison will follow with other programs built for 
making analogies. 

4.1 The symbolic-subsymbolic spectrum 

Since artificial intelligence is such a multidisciplinary research area, it is home to many, 
often very different ideas about how to create intelligent (or) human behavior. One way 
to map it out, is to draw a line from the symbolic to the subsymbolic paradigm. On 
one end, situations are described and objects represented with atomic symbols, both in 
the semantic sense (they refer to categories or external objects, like "car") and in the 
syntactic sense (they are operated on by "symbol manipulation") [21]. On the other end, 
descriptions statistically emerge out of a complex pattern of many subsymbols, which are 
distributed entities, such as nodes (and their weights) in a neural network, or classifiers 
in a classifier system (I'll be explaining these soon). They are operated on by complex 
algorithms, often including sending the data several times through neural networks, which 
first have to be 'trained' by feeding them examples. 

In the symbolic paradigm, symbols are basically arbitrary labels (change "car" ev- 
erywhere into "spaceship" and the program will still behave the same), and they are 
mostly controlled by a central top-down control mechanism. The symbols are said to be 
content-defined. In contrast, subsymbols are called origin-defined, because they are often 
associated with a continuous numerical value, originating from sensory data (for exam- 
ple the intensities observed (amplitude, voltage) for each frequency as a certain word is 
spoken, or each pixel in an image), or simply any useful encoding of the objects observed 
in a certain domain (for example, car=[0.245, -0.455, 0.593, ...]) since it doesn't really 
matter how the word "car" sounds, as long as we can uniquely identify it. Likewise, 
entire sentences or other structures can be encoded and processed. This processing happens 
bottom-up (for example sending these values through a neural network), and usually no 
central control mechanism is used. This time, two (sub)symbols can't simply be switched, 
because this would seriously affect the behavior of the program using these subsymbols. 

Chalmers, French and Hofstadter [5] state that there exists a "meaning barrier" be- 
tween these two paradigms, that has rarely been crossed by work in AI: 

On one side of the barrier, some models in low-level perception have been ca- 
pable of building primitive representations of the environment, but these are 
not sufficiently complex to be called "meaningful". On the other side of the 
barrier, much research in high-level cognitive modeling has started with rep- 
resentations at the conceptual level, such as propositions in predicate logic or 
nodes in a semantic network, where any meaning that is present is already built 
in. 

The latter problem is often referred to as the "symbol-grounding problem", since the 
symbols have no "feeling" with the observable object (and its surroundings) that they 
represent. In contrast, the subsymbolic approach has the "variable-binding problem", 
because there is no clear relationship between the observed concept 'tokens' and their 
platonic concept 'types'. The normal place for types and tokens to meet would be some 
kind of workspace, but since the workspace has never had a clear neurological counterpart, 
it is seldom used in subsymbolic architectures, and both types and tokens are represented 
in a single network. Very recently though, an article in Nature reported the surprising 
discovery of a working memory for visual data, so this might just give a new direction to 
subsymbolic research. 

Figure 15: The symbolic-subsymbolic spectrum. 

Fluid concept architecture is situated "somewhere in between". Since this doesn't 
teach us much, we need a further refinement of our one-dimensional map. Based on 
three major issues in the symbolic-subsymbolic discourse, Blank, Meeden and Marshall [1 
spread out the paradigm space along three axes, based on representation, composition 
and functionality, as depicted in figure 15. Notice that fuzzy and probabilistic theories are 
used by both extreme paradigms to broaden their representation, while hybrids are often 
floating somewhere in between. 

4.2 Positioning Fluid Concept Architectures 

In this three-dimensional cube, fluid concepts are placed in the upper left corner. We will 
now discuss why, and how we should interpret this.

4.2.1 A largely symbolic representation

The concepts in the Slipnet are atomic, discrete, static, and content-defined. The names 
of the nodes are also quite arbitrary. The system wouldn't mind just naming them node1,
node2 and so on, since the exact names are never used, and are just an (English) facade
for human convenience. Still, the way symbols are used awards them at least some degree 
of meaning, thanks to their correlation with actual phenomena, even if those phenomena 
take place in a tiny and artificial world [8, p. 290]. Concepts are put in an intimate rela- 
tionship with each other in the Slipnet, and this coherence is heavily used in building up 
the representations (descriptions and structures) of the perceived tokens. The meaning 
of a concept is not simply encapsulated in a node. If you activate a node, the activation 
spreading creates a sort of halo surrounding a concept, activating other conceptually close 
concepts. Every node inside the halo of another concept is treated as a part of it. It too
will create top-down codelets, and it too has a chance of serving as a descriptor for the
observed concept. The token that gets descriptions can use these descriptions as "feelers"
with the concept as it evolves and interacts. If the concept is deemed temporarily
important, so will be the token. If a structure's description concepts fade away, it will also get
weaker and weaker. To at least some extent, the symbols must thus be granted some kind
of grounding, and the meaning barrier is crossed.
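
The halo can be illustrated with a minimal spreading-activation sketch. The node names,
link weights and threshold below are assumptions made up for illustration, not values from
Copycat's Slipnet, and the update rule (a node keeps the larger of its own level and what
arrives from its neighbours) is a deliberate simplification. The point is only that
activating one node pulls conceptually close nodes above a threshold, and that this set
can be read as the concept's halo.

```python
# Toy Slipnet fragment: each node lists (neighbour, closeness in [0, 1]).
# All names and numbers are hypothetical, chosen for illustration only.
LINKS = {
    "flower": [("stamen", 0.8), ("color", 0.6), ("beauty", 0.3)],
    "stamen": [("pollen", 0.9)],
    "color": [],
    "beauty": [],
    "pollen": [],
}


def spread(activation, links, rounds=2):
    """Return a new activation map after `rounds` synchronous spreading steps.

    Each node sends activation * closeness to its neighbours. A node keeps the
    larger of its current level and the total it receives, capped at 1.0.
    The input map is not modified.
    """
    current = dict(activation)
    for _ in range(rounds):
        incoming = {node: 0.0 for node in links}
        for node, neighbours in links.items():
            for neighbour, closeness in neighbours:
                incoming[neighbour] += current.get(node, 0.0) * closeness
        current = {
            node: min(1.0, max(current.get(node, 0.0), incoming[node]))
            for node in links
        }
    return current


def halo(activation, threshold=0.5):
    """Nodes whose activation reaches the threshold form the halo."""
    return {node for node, level in activation.items() if level >= threshold}


start = {"flower": 1.0}
result = spread(start, LINKS)
flower_halo = halo(result)
```

With these made-up weights, 'pollen' ends up inside the halo of 'flower' although the two
are not directly linked, while the weakly linked 'beauty' stays outside it.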

This halo also makes the concept look a bit distributed, as a concept is usually spread 
out over different nodes. In a scientific context, the halo of the concept 'flower' will be 
stretched towards scientific details (stamen, pollen,...), while in a poetic context, the halo 
would probably stretch towards more qualitative attributes (color, beauty,...). But this is 
not quite distributed in the way subsymbolic entities are, since there, meaning is never
contained in any set of nodes, but rather in the effect that processing through a
neural network has.

4.2.2 A very subsymbolic functionality

The latter means that cognitive phenomena in subsymbolic systems are emergent statis- 
tical effects of a vast array of small distributed events without any overarching control 
mechanism. But also in fluid concept systems, fine-grained parallelism, commingling of
local pressures, spreading activation and halos of emergent concepts are deemed
indispensable in achieving the flexibility we want them to display. In its functionality, fluid
concept architecture thus has a strong subsymbolic flavor. An important philosophical
difference however, is that subsymbolic approaches take pride in their neurological relevance,
while fluid concept architecture only seeks psychological realism. It is based on many
psychological findings
(mostly in visual experiments), like the separation between a Platonic-concept memory 
and a workspace where instances reside |5J p. 293]. They simply assume the existence of 
something like Copycat's workspace and investigate how a spreading-activation network 
with context-dependent and permanent concept types can interact with this working area. 
Starting from the apparent nature of concepts, with operating principles lying somewhere
in between the symbolic and subsymbolic paradigms, provides a kind of shortcut
that makes concepts manageable, without oversimplifying them. In return, this means 
that the functionality of a system processing these concepts, must likewise be able to han- 
dle their distributed (context-dependent) nature. One very solid way to do this, is indeed 
to incorporate perception as a fundamental part of this processing, and this requires a 
very 'subsymbolic' functionality. It may at first seem that using a deeply realistic model 
of concepts and incorporating perception are two quite isolated aspects of the architec- 
ture, but most likely, the integration of perception is, apart from being very interesting 
by itself, also a logical consequence (maybe even a necessity) at a deeper level.

Parallelism, spreading activation and so on are of course not techniques that can
only be used in subsymbolic programs, but they do have such a flavor because of their
ubiquity in that approach. More distinguishing are the microsemantic and holistic
aspects. In a microsemantic approach, semantics are internal to the representation, while 
in a macrosemantic approach, they are external to it (contained in a rule base for exam- 
ple). Fluid concept systems are thus microsemantic, since the codelets use only semantic 
information contained in the representation. There is some processing-based informa- 
tion stored in codelets, but what a specific representation means is captured internally 
(through structures and descriptions). An approach is holistic if it operates on
representations without decomposing them, and atomistic otherwise. In predicate logic, one often
takes an object out of a relation, operates on it, and then puts it back; the structure
holding the object is thereby decomposed. When a distributed symbol is sent through a
neural net, its values can all change, but the operation is executed on the representation
as a whole, and is therefore holistic. In fluid concept systems, an instance is never put
into a structure; it is only described that way by descriptions. A worker thereby always
works on a structure as a whole and doesn't seek to operate on its constituents.

4.2.3 A largely symbolic composition 

Since subsymbolic distributed patterns usually cannot expand or contract, every struc- 
ture you want to build with this representation must be superimposed. When a structure 
has been built with a subsymbolic representation, this is apparent in the pattern of the
values of the different subsymbols, but not as an explicit structure of which the
representation is a part. This is contrary to the symbolic concatenation of symbols, where
a symbol can simply become part of a relation or can get attributes. In fluid concept
systems, an instance can get descriptions, or can be bonded in a structure. In this
regard, it thus has a very symbolic nature. Another aspect of subsymbolic composition is
that it is heavily context-sensitive. Although the processing of fluid concept systems is
context-sensitive, in that the Slipnet coherence will be different in different contexts, its
representation is only mildly affected by this, through descriptions. This is a different level
of context-dependency than the one present in the subsymbolic paradigm. For example,
in subsymbolic approaches, the object 'waiter' in the sentences 'he gave it to the waiter'
and 'she handed it to a waiter' will be represented differently in both cases, although both
representations in fact refer to the same thing, and would be represented that way in our
architecture. A difference however with symbolic systems, is that structural components
don't have to be either present or absent. In fluid concept architectures structures can 
be proposed, but not yet built. This tentative presence of a structure feels somewhat 
subsymbolic, since in that view, structures can be partially present. 

4.3 Related Architectures 

Both in the symbolic and subsymbolic corner, there exist some programs whose archi- 
tectures greatly influenced (parts of) the architecture described here. Historically, these 
could be interpreted as a series of great battles fought while, in parallel, the architec- 
tural ideas behind Copycat were developing. Building Copycat, basically the mother of 
our architecture, took about a decade of research, from around 1984 till 1993, and it is 
very interesting to see what was happening in the background. Only the most impor- 
tant projects with a direct influence on the architecture (not on analogy-making), will be 
discussed.

4.3.1 Hearsay-II (Erman et al., 1980)



The speech-understanding system Hearsay-II was a great source of inspiration for Copy- 
cat and its predecessors. A spoken waveform was interpreted through the cooperation of 
various knowledge sources, each responsible for a specific task. These tasks comprised di- 
viding the waveform into segments, creating phones and phonemes out of these segments, 
proposing syllable-class hypotheses from phonemes, and proposing word hypotheses from 
syllables (notice the first two work bottom-up, while the latter two work top-down). These 
knowledge sources then built corresponding data structures on a global blackboard. The
intuition that perception-making called for different agents to deal with different levels
of analysis was clearly carried further in the play of codelets on the Workspace, but in a
much more fine-grained way, as codelets have localized, microscopic responsibilities compared
to those of knowledge sources (mainly to support the scouting behavior). The building
of structures on a blackboard as a means of communication between top-down and
bottom-up working agents was also fully adopted.

In Hearsay-II, agents called demons, consisting of a host of conditions, preconditions 
and pre-preconditions survey the scene, eagerly searching for a valid pretext to call in 
a knowledge source. This role is preserved by splitting up a codelet into a scout, an
evaluator and a builder. The difference however is that in Hearsay-II, clashing hypotheses
can freely coexist, creating a large number of possible combinations, while in Copycat an
evaluator codelet is allowed to tear down clashing structures if its structure is deemed
to fit better than the existing ones. Psychologically, neither is fully realistic. Humans
can't keep every possible hypothesis in mind, but neither do they simply forget alternate
views. Metacat, by using its Temporal Trace, does have some means to improve on this,
but doesn't come close to the flexibility displayed by humans. Another difference is
that Hearsay-II uses a central scheduler that assigns a priority to each knowledge source, 
as an estimate of their usefulness in the overall goal of the program. There is thus a 
central, non-probabilistic control mechanism. This means that Hearsay-II tries to make 
intelligent decisions, while in Copycat, no single small action has absolute importance, as 
intelligence has a more emergent nature. 
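
The contrast between a central, non-probabilistic scheduler and probabilistic selection
can be sketched as two selection rules over the same set of pending tasks. The task names
and ratings below are hypothetical, and neither function claims to reproduce Hearsay-II's
scheduler or Copycat's Coderack; the sketch only shows the difference in control style.

```python
import random

# Hypothetical pending tasks, each with a priority / urgency estimate.
PENDING = {"segment-waveform": 5.0, "propose-word": 3.0, "verify-syllable": 2.0}


def pick_by_priority(pending):
    """Deterministic control: always run the highest-rated task."""
    return max(pending, key=pending.get)


def pick_by_urgency(pending, rng):
    """Probabilistic control: chance of being picked is proportional to urgency."""
    names = list(pending)
    weights = [pending[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
counts = {name: 0 for name in PENDING}
for _ in range(1000):
    counts[pick_by_urgency(PENDING, rng)] += 1
```

Under the first rule the lowest-rated task never runs as long as a better-rated one is
waiting. Under the second rule it still runs now and then, so no single choice carries
absolute importance.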



4.3.2 Jumbo (Hofstadter, 1983) 

Jumbo [5J ch.2], a program trying to make English-sounding words out of a series of
letters, introduced two key elements of Copycat's architecture. First of all, it extended the
blackboard of Hearsay-II to behave more like the cytoplasm of a living cell. We've already
explained how this worked in section 2.5. At that time however, there was not yet a
concept of genes in DNA [nodes in the Slipnet]. Instead, since only some molecule bonds
are possible, and each bond has its own strength, all useful combinations of letters were
described in a chunkabet, a list of possible bonds with a strength-value, depending on
the position the bond has in a word. For example, one entry is: 'st: initial 8, final 4'.
This means the bond s-t has a strength-value of 8 if it occurs at the beginning of a word,
and a value of 4 if it occurs at the end (middle-strength values are sometimes given as well).
These values could be automatically generated out of a database with a large number of
words, but in Jumbo they were assigned based on intuition. Codelets and the Coderack
were already defined much as they are now, with urgency-values and a probabilistic
selection (but without top-down codelets spawned by the Slipnet, since it didn't exist). They
were sent into the cytoplasm and made bonds between letters, assessing their strengths
based on the chunkabet. The temperature was fully introduced as a fundamental part
of the workings of enzymes in the cytoplasm. The second key contribution was the full
introduction of the parallel terraced scan, as reflected by splitting up codelets into scouts,
evaluators and builders. Letters also had a happiness-value (then only based on existing
bonds, because descriptions didn't yet exist), and under the motto "the squeaky wheel
gets the oil", codelets were attracted towards those unhappy letters.
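
A chunkabet lookup can be sketched as a small table keyed by bond and position. Only the
'st' entry is taken from the text above; the 'ng' entry, the function names and the choice
to return 0 for unlisted bonds are assumptions made for this illustration, not details of
Jumbo itself.

```python
from itertools import permutations

# A tiny chunkabet: bond -> strength-value per position in the word.
# Only the 'st' entry comes from the text; 'ng' and its value are made up.
CHUNKABET = {
    "st": {"initial": 8, "final": 4},
    "ng": {"final": 7},
}


def bond_strength(left, right, position):
    """Strength of the bond left-right at a position, or 0 if it is not listed."""
    return CHUNKABET.get(left + right, {}).get(position, 0)


def candidate_bonds(letters, position):
    """All ordered letter pairs that the chunkabet allows at this position."""
    found = {}
    for left, right in permutations(letters, 2):
        strength = bond_strength(left, right, position)
        if strength > 0:
            found[left + right] = strength
    return found
```

For the letters 't', 's', 'g', 'n', this table offers 'st' and 'ng' as possible word endings
but only 'st' as a possible word beginning, which is the kind of information a codelet
would consult when assessing a proposed bond.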

4.3.3 Seek-Whence (Meredith, 1986)

Seek-Whence tries to find underlying regularities in a sequence of integers, for example
1121231234... It was heavily based on Jumbo, but instead of a chunkabet, it introduced 
a first version of the Slipnet, a semantic network, which also had some 'slipping links', 
allowing a concept to slip into another. At that time however, concepts didn't have any 
activation or depth, and links didn't have a conceptual distance. Also, the Slipnet didn't 
throw top-down codelets. Generally speaking, the program's focus was on the processing 
through codelets, but less on the conceptual aspects. There were many special-purpose 
domain-specific codelets, and a concept wasn't supposed to slip until the codelet processing 
got jammed up. It thus did introduce the notion of pressures, but in a much more limited 
way. Copycat moved the focus towards the perceptual and conceptual aspects, and to 
the play of top-down and bottom-up pressures, hereby closing the feedback loop between 
perceptual and conceptual activity. Seek-Whence had a great impact on Copycat by providing a
first definition of analogy-making with the Jumbo architecture, and by pointing out many
shortcomings that still had to be tackled to ensure flexible processing. 

4.3.4 Classifier-Systems (Holland et al., 1986) 

A subsymbolic counterpart of large aspects of Copycat's architecture can be found in 
classifier systems, which therefore might have had (and may yet have) an effect on the de- 
velopment of the conceptual aspects (parallel scan, top-down and bottom-up pressures,...) 
of Copycat. A classifier-system is composed of a large number of simple agents called clas- 
sifiers. It is connected to the environment through two interfaces (one for input and one 
for output), and communicates with it using messages. Classifiers effectively classify
messages: they decide what to do in response to them. They can respond by sending messages
to other classifiers or by acting on the environment. In the latter case, the environment can
respond by rewarding the classifier. Since this program is subsymbolic in nature,
concepts are distributed over a number of classifiers. There is of course no central
controlling mechanism, and its behavior emerges from thousands of cooperative and competitive
interactions amongst classifiers. If a classifier produces beneficial messages for the system,
it is strengthened by an algorithm called the bucket brigade, and is then probabilistically
more likely to defeat other competing classifiers. Another algorithm, called the genetic 
algorithm, creates a sort of natural selection amongst classifiers. Weak classifiers can 
die out and strong ones can even recombine with other classifiers, thereby passing on 
their genes. The combination of these two algorithms enables the system to adapt to the 
environment.
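
The strengthening step can be illustrated with a heavily simplified bucket-brigade pass
along a chain of classifiers that fire one after another. The classifier names, the bid
fraction and the reward are assumptions of this sketch; real classifier systems add
competitive bidding, message matching and taxes, none of which is modeled here.

```python
BID_FRACTION = 0.1  # assumed share of its strength that a firing classifier pays


def run_chain(strengths, chain, reward):
    """Apply one simplified bucket-brigade pass along a chain of firing classifiers.

    Each classifier pays its bid to the classifier that fired just before it
    (its supplier); the last one in the chain receives the external reward.
    Returns a new strength table; the input is left unchanged.
    """
    updated = dict(strengths)
    previous = None
    for name in chain:
        bid = BID_FRACTION * updated[name]
        updated[name] -= bid
        if previous is not None:
            updated[previous] += bid
        previous = name
    if previous is not None:
        updated[previous] += reward
    return updated


strengths = {"detect": 10.0, "plan": 10.0, "act": 10.0}
chain = ["detect", "plan", "act"]
after = run_chain(strengths, chain, reward=5.0)
second = run_chain(after, chain, reward=5.0)
```

After the first pass only the classifier acting on the environment has gained strength.
After the second pass its supplier has gained as well, because it is now paid a larger bid:
the reward trickles backwards through the chain, one link per pass.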

4.4 Analogy-making compared 

Since this paper has been discussing an architecture, supposedly (and to some extent 
proved to be) capable of doing more than just analogy-making, I've been focusing on how 
it performs against other architectural paradigms. We've seen that the incorporation of 
perception and the in-depth treatment of concepts gives it some considerable advantages 
over most symbolic approaches in terms of flexibility, meaningfulness and creativity. On
the other hand, thanks to its symbolic representation and composition, it is more readily 
understandable, and thus easier to work with compared to the subsymbolic approach. 
Yet, this might feel rather intangible, so, as an illustration, let's look at one specific and 
recent program in each paradigm. Generally speaking however, most other programs will 
do a very good job, but will lack the extra qualities (independence, creativity, flexibility,
meaningfulness, psychological aspects,...) fluid concept architecture has to offer, which
are mostly outside the scope of most research. Also, while many models will focus on the
simulation of a human cognitive faculty, fluid concept systems will focus on a novel generation
of that faculty.

For a full, up to date, but still very short overview of analogy-making along the 
symbolic-subsymbolic spectrum, see French [5]. A somewhat lengthier description, focusing
on the most important projects, has been written by French and Kokinov [BJ. For a
full-fledged comparison of Copycat in particular (from its designers' point of view) with 
the programs discussed below, see [Sj ch.6]. 

4.4.1 A symbolic model: SME 

The Structure Mapping Engine (SME), an implementation based on the Structure Map- 
ping Theory by Gentner, has been the most influential computational model of analogy- 
making to date. As its name might reveal, it builds a mapping between two situations 
based on the structure of their representations, these representations being described in 
predicate calculus. Consider figure 16, in which a structural resemblance (in thick lines) 
was discovered between the Solar system [source] and the Rutherford atom [target]. The 
analogy can then be completed by a transfer of the remainder of the source structure to
the target (in dashed lines), in this case finding that the cause of the electron revolving 
around the nucleus is its charge.

Figure 16: SME at work.

This model has some very strong assets. It explicitly employs the notion that knowledge
is structured. It has a high psychological relevance, since numerous psychological
experiments have confirmed that this relational mapping is crucial in producing convincing 
analogies. Its strongest asset is that it is generally applicable to all domains of analogical 
comparison. On the other hand, one can state that this generality is only an illusion. 
Since semantic connections play no role in its processing, it cannot separate relevant data 
from isolated details, and thus must be fed exactly the data relevant to the problem. A 
simple decoy may totally disrupt the processing. Fluid concept systems don't have this
limitation because perception and high-level processing are intertwined. Also, one might
object that conceptual similarity and slippage are very important for analogy-making, while
these aren't present at all in SME. Regarding representational flexibility, SME 'suffers' from the
unambiguity issues discussed in section 3.3. Another remark is that it always deems the
mapping with the highest degree of systematicity to be the best one, and thus does not
support a healthy competition between rival views. And last but not least, SME is also a
brute-force system, which is considered to be psychologically unrealistic.



4.4.2 A connectionist model: ACME 

In an Analogical Constraint Mapping Engine (Holyoak & Thagard) one first defines a set 
of constraints that have to be simultaneously satisfied, like structural similarity, semantic 
similarity and pragmatic importance. Next, the representations for the source and target 
are entered. The system then builds a constraint-satisfaction network (a kind of neural 
net), where each node corresponds to a possible pairing hypothesis for each element of
both source and target. So, if you want to map an airplane onto a car, the system creates
hypothesis nodes like 'left wing → left front wheel', but also 'left wing → seat-belt' and so
on. Then, special links (excitatory as well as inhibitory) are added between these nodes to
represent the proposed constraints. By activating this network, eventually an equilibrium 
state will be reached, and the best (most active) set of consistent mapping hypotheses 
wins. This way, all possible mappings are evaluated in parallel, and the best one is always 
found. It also can involve semantics if a 'semantic unit' is added. A descendant of ACME, 
namely LISA (Hummel & Holyoak) introduces a time axis so that patterns of activation 
can oscillate over time, and patterns of activation oscillating in synchrony are considered 
to be bound together. 
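
The constraint-satisfaction network described above can be sketched as a handful of
hypothesis nodes that are relaxed until the activations settle. The hypotheses, link
weights, update rule and the small constant input are all assumptions of this sketch; it
is not ACME's actual update rule, only an illustration of how excitatory and inhibitory
links let one consistent set of pairings win.

```python
# Hypothetical pairing hypotheses for mapping an airplane onto a car.
NODES = ["wing->wheel", "wing->seat-belt", "engine->motor", "engine->wheel"]

# Symmetric links: positive = excitatory, negative = inhibitory.
# The weights are invented for illustration.
LINKS = {
    ("wing->wheel", "engine->motor"): 0.5,     # mutually consistent pairings
    ("wing->wheel", "wing->seat-belt"): -0.6,  # same source element: rivals
    ("engine->motor", "engine->wheel"): -0.6,  # same source element: rivals
    ("wing->wheel", "engine->wheel"): -0.6,    # same target element: rivals
}


def weight(a, b):
    """Link weight between two nodes, 0.0 if they are not connected."""
    return LINKS.get((a, b), LINKS.get((b, a), 0.0))


def settle(nodes, steps=100, rate=0.2, bias=0.02):
    """Relax the network synchronously; activations are clipped to [0, 1].

    `bias` is a small constant input to every node (an assumption of this
    sketch) so that unopposed hypotheses drift upwards.
    """
    act = {node: 0.1 for node in nodes}
    for _ in range(steps):
        net = {
            n: sum(weight(n, m) * act[m] for m in nodes if m != n)
            for n in nodes
        }
        act = {
            n: min(1.0, max(0.0, act[n] + rate * net[n] + bias))
            for n in nodes
        }
    return act


final = settle(NODES)
winners = {node for node, level in final.items() if level > 0.5}
```

With these weights the two mutually supporting hypotheses end up active, while their
rivals are driven down to zero.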

Some remarks however are that trying out all syntactically plausible pairings is likely 
to become computationally infeasible and psychologically implausible in real-world situ- 
ations, that it's too dependent on representations, which are also rigid and have to be 
tailored specially for each new analogy, and that its semantic unit is 'frozen', thus not 
adapting to any pressures, as happens in fluid concept systems. 



4.4.3 A hybrid model: AMBR 

In the last decade, many hybrid models have appeared (see Stamou, Vogiatzis & Strove [22]
for a classification). Kokinov's Associative Memory-Based Reasoning system for example,
uses many agents carrying out small symbolic tasks, behaving like codelets in many ways,
but linked together to form a network in which each agent has a time-varying degree of
activation, including spreading activation to neighboring agents and activation decay over 
time. An agent's activation represents its perceived relevance and determines the speed 
with which it executes itself, stopping completely if the activation falls below a certain 
threshold. The system has no workspace, and thus no building of perceptual structures 
ever takes place. On the other hand, it does have an episodic memory. Although it holds
many exciting ideas, the symbols it uses are once again very hollow, without true semantic
value, and no high-level perception is done at all.

5 Conclusion



It's widely believed that the world as we see it is a reconstruction, based on sensory data 
and information somehow stored in our brain (through experience and/or heritage). Ev- 
erything we see, hear, feel,... gets recognized or otherwise adopted as a new or refined 
concept. But since every situation is different, recognition is not a mere matter of rigidly 
applying predefined, static concepts to describe aspects of an uninterpreted situation. An 
essential part of the recognition process is a mutual accommodation of one's conceptual 
background with one's developing mental representation of the situation at hand [16].
This means that these concepts must be adaptable to different situations, that this men- 
tal representation must be at least partially rebuilt for every changing situation, and that 
conceptual and perceptual processing must somehow influence each other. If you're
watching a computer screen and you feel a smooth round object that fits in your palm, 
it is most likely a computer mouse. But if you move this object and nothing happens on 
your screen, this creates a pressure causing the concept 'mouse' to slip into something 
closely related, maybe into a round paperweight that you use on your desk. This
interaction thus creates a drive toward conceptually plausible interpretations. You can guess a
sentence's structure and meaning by reading the first few words, someone's intentions from
a few words and the intonation in her or his voice, and a flower perhaps by just catching
a glimpse of its shape and color. In unexpected situations you may also be entirely on the
wrong track, in which case you may have to start the process over again, perhaps starting
from (or looking for) other information.

Most research in artificial intelligence handles perception as something completely
isolated from conceptual processing (it is mostly entered at many levels by the programmers),
the concepts used are mostly brittle, rigid and inextensible objects, and problems are
mostly solved by trying every possibility (unless it can be ruled out). Fluid concept
architectures however do their own perception making (and representation building), using
slipping concepts in an activation spreading network, a workspace where mental repre- 
sentations are formed, and tiny agents proposing perceptual structures given the data 
(bottom-up) or trying to translate conceptual pressures into perceptual structures in this 
workspace (top-down). These structures then in turn affect the activation of related con- 
cepts. Since not every possibility is tried, but only the most probable interpretations are 
followed, one has to be very careful which way to go. To ensure different pathways or 
interpretations are given equal chances, this processing happens in (simulated) parallel 
and uses a terraced scan, which lets different mental structures hypothetically coexist to 
see if they can survive in the context, before actually building them and letting them join 
in the game. This results in a viewpoint that grows stronger and stronger, unless the 
processing gets stuck, in which case some structures are broken down to change course. 
All this is guided by self-watching mechanisms, like a self-regulated temperature, which 
measures the confidence the program has in the current interpretation. 

Unfortunately, fluid concept architectures are as complex as they sound, involving
a great many parameters, and they seldom lead to optimal solutions. The good news
is that they offer some advantages that are very rare. They don't need programmers to 
do the perception for them (once a domain is described). They are extremely flexible, 
adapting effortlessly to new situations. They take conceptual pressures and contextual 
data into account. And the fact that they produce their own representations and extract
their own essences out of a situation means that they often exhibit a reasonable degree of
creativity, coming up with very appealing, yet unexpected answers. In a way, they also
behave quite like humans, letting things slip or doing things that seem to go nowhere (but then
again you never know). Whether you think this is charming or just something to avoid,
these models certainly offer some fresh new ideas which force you to think.